Software Engineer

Colombo, Sri Lanka
B.Sc. Special (Hons) in IT, MCTS, MCPD, MCP

Monday, May 27, 2013

Add a robots.txt file to SharePoint 2010

What is robots.txt?

This (robots.txt) is a plain text (not HTML) file placed in the root of your site to tell search robots which pages should and should not be visited/indexed. Search engines are not obliged to adhere to the instructions in robots.txt, but they generally obey what they are asked not to do.

Web site owners use the /robots.txt file to give instructions about their site to web robots. This is called the Robots Exclusion Protocol.
It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks http://www.example.com/robots.txt and reads the instructions in your robots.txt file.

Creating a Robots.txt

    1. Launch Notepad
    2. Put the following in your robots.txt file:
          

      User-agent: *
      Disallow: /
   
    3. Save the file as: robots.txt 

The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site. 

Adding the robots.txt file to the root of your public anonymous SharePoint site

The location of robots.txt is very important. It must be in the main directory, because user agents (search engines) do not search the whole site for a file named robots.txt. Instead, they look only in the main directory (i.e. http://www.sitename.com/robots.txt); if they don't find it there, they simply assume that the site does not have a robots.txt file and index everything they find along the way. So, if you don't put robots.txt in the right place, don't be surprised if search engines index your whole site.

To do this, you can simply follow my article Add a file to the root of a SharePoint site using PowerShell (below).

Ensure the file is accessible to search engines
  
To ensure the file is accessible to search engines, go to your site URL and append "/robots.txt".
      
      Example: http://www.sitename.com/robots.txt

You can also use Robots.txt Checker to do this.
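If you prefer PowerShell, here is a minimal sketch that fetches and displays the file so you can confirm it is reachable; replace the URL with your own site:

# Download and print the robots.txt file from your site
$client = New-Object System.Net.WebClient
$robots = $client.DownloadString("http://www.sitename.com/robots.txt")
Write-Host $robots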

There are two important considerations when using /robots.txt:

  • Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention.
  • The /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to use.
So don't try to use /robots.txt to hide information.

Thursday, May 23, 2013

Add a file to the root of a SharePoint site using PowerShell

Sometimes you need a file right at the root of your Internet-facing SharePoint site, such as robots.txt or sitemap.xml. To do this, we'll take advantage of PowerShell's ability to call any .NET method, along with the Files collection on each SPWeb in SharePoint.

Copy the following commands and change the values according to your needs. The first value is the full path to the local file you want to upload; the second (the file name passed to Add) is the name under which the file will be saved at the root of the site. Also change "http://sharepoint1" to your own site URL.

# Read the local file into a byte array (change the path to point to your file)
$fileBytes = [System.IO.File]::ReadAllBytes("c:\the\full\path\to\your\file.txt");
# Open the site collection and add the file to the root web; $true overwrites an existing file
$site = Get-SPSite "http://sharepoint1";
$site.RootWeb.Files.Add("robots.txt", $fileBytes, $true);
$site.Dispose();


Open PowerShell (or the SharePoint 2010 Management Shell) as administrator, then copy and execute the commands.
This will result in the file being available at "http://yourdomain:portifneeded/robots.txt" (or whatever file name you passed to Add).

Be Happy !!! :)

Schedule a Site Collection Backup with a PowerShell Script - SharePoint 2010

I will demonstrate how to back up a SharePoint 2010 site collection automatically through the use of a PowerShell script. My challenge: how can I back up a SharePoint 2010 site collection automatically on a regular basis?

Here is the solution.

Step 1.

  First, set up a backup location. It can be a local folder like "C:\Backup\MySite" or a shared location like
  "\\MYPC\Backup".

Step 2.

  Then, verify that you possess the following permissions:
  • SharePoint Farm Administrator.
  • Local Server Administrator on all Web servers.
  • db_owner permission on the content database.
  • Full Control permission on the backup folder.

Step 3.

Download this PowerShell script for backing up a site collection and change the values according to your environment: set "$mySite" to your site collection URL and "$backupLocation" to your backup location. My script name is "BackupSite.ps1".

Download "BackupSite.ps1"

Basically, the above script will do the following (a sketch of such a script is shown after this list):
  • Assign all the information needed to start the backup.
  • Try to create a new backup and name the file based on the current date.
  • If successful, it will write a success message to the log file. Otherwise, it will log the error/exception.
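
For reference, here is an illustrative sketch of what such a script could look like. It is not the exact contents of "BackupSite.ps1"; the log file name and the date format used for the backup file are assumptions.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Values to change for your environment
$mySite = "http://sharepoint1"
$backupLocation = "C:\Backup\MySite"
$logFile = Join-Path $backupLocation "BackupLog.txt"

# Name the backup file based on the current date, e.g. 2013-05-27.bak
$backupPath = Join-Path $backupLocation ((Get-Date -Format "yyyy-MM-dd") + ".bak")

try
{
    # -Force overwrites an existing backup file with the same name
    Backup-SPSite -Identity $mySite -Path $backupPath -Force -ErrorAction Stop
    Add-Content $logFile "$(Get-Date) - Backup succeeded: $backupPath"
}
catch
{
    Add-Content $logFile "$(Get-Date) - Backup failed: $($_.Exception.Message)"
}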

Step 4.

Download the batch file to run the script from the following link. (To create the batch file yourself, open Notepad, add its lines, and save it with a ".bat" extension; a sketch is shown below the link.)
Download "RunBackup.bat"

Step 5.
 
Copy both the script and batch file to a folder on the SharePoint Server.

Step 6.

Run the batch file to start backing up the site collection immediately, or use Windows Task Scheduler to schedule it (an example command is shown below).
I have attached both the script and the batch file for your convenience.
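
As an example (the task name, time, and path are assumptions), the scheduled task can be created from an elevated command prompt; the task must run under an account that has the permissions listed in Step 2:

schtasks /Create /TN "SharePoint Site Backup" /TR "C:\Scripts\RunBackup.bat" /SC DAILY /ST 02:00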

Notes:
  • This method works with both SharePoint 2010 Foundation and Server.
  • While performing the backup, the SharePoint site collection will be set to Read-Only to prevent data corruption, so it's recommended that you run this script during off-peak hours.