338.527 views / 2008-05-19
Many site owners hide some of their pages, or even the entire site, from search engines. You can find those sites through their robots.txt files.
Robots.txt is a text file in the root directory of a site that tells robots which pages they should not crawl. With the 'Disallow' directive, a site owner can keep parts of the site from being found through search engines.
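To make the 'Disallow' directive concrete, here is a minimal sketch in Python using the standard library's urllib.robotparser. The robots.txt content, the paths, and the "ExampleBot" agent name are made up for illustration and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/report.html
"""

parser = RobotFileParser()
# parse() accepts the file as a list of lines, so no network access is needed.
parser.parse(SAMPLE_ROBOTS.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if a well-behaved robot may fetch the given path."""
    return parser.can_fetch(agent, path)


print(is_allowed("/index.html"))          # True: no rule matches this path
print(is_allowed("/private/notes.html"))  # False: matches Disallow: /private/
```

The file only advises robots that choose to honor it. It does not stop a browser from requesting the same path.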
1. Open http://www.google.com and search for:
"robots.txt" "disallow:" filetype:txt
2. The results are robots.txt files from sites that use the Disallow directive.

3. As an example, let's open the first site: WhiteHouse. We can see that a lot of pages have been hidden from search engines.

4. To open a 'forbidden' page, copy the path from the Disallow line you want, without the "text" at the end.

5. Now replace /robots.txt in the browser's address bar with the copied text and press Enter. The page will open.

This is the hidden page from WhiteHouse.
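Steps 4 and 5 can also be sketched in Python: collect every Disallow path from a robots.txt file and turn it into a full URL. This is only an illustration. The function name disallowed_urls, the sample content, and the www.example.com address are assumptions, not part of the original walkthrough.

```python
from urllib.parse import urljoin


def disallowed_urls(robots_url, robots_text):
    """Return absolute URLs for every non-empty Disallow path in robots_text."""
    urls = []
    for raw_line in robots_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "disallow":
            continue
        path = value.strip()
        if path:  # an empty Disallow means nothing is blocked
            # Same effect as replacing /robots.txt with the copied path.
            urls.append(urljoin(robots_url, path))
    return urls


# Hypothetical robots.txt content, invented for this example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/   # internal pages
Disallow:
Disallow: /drafts/report.html
"""

for url in disallowed_urls("http://www.example.com/robots.txt", SAMPLE_ROBOTS):
    print(url)
```

For the sample above this prints the /private/ and /drafts/report.html addresses on www.example.com. Whether such a page actually opens depends on the server, since a site can still restrict access to it.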

Of course, you can find more interesting pages; this was just an example.
Now you can be like a modern online Sherlock Holmes. Find what others hid.
|
but before you go and pretend to be a super cool internet detective, you need to know the past tense conjugation of the verb 'to hide', which is HID, not hidded......
| eric says: |
15:38 / 22 jun 2008 |
I'm pretty sure that was more or less a play on words, and he wasn't serious...
The guy is just trying to help us. Give him a break.
| Dan says: |
23:32 / 22 jun 2008 |
If you look at his screenshots, his Google is in Romanian or something... so English probably isn't his first language. Nice article. Gonna try this out.
| Matt says: |
23:42 / 22 jun 2008 |
I'm more amazed how all these comments are from a month into the future!!!
| Jon says: |
00:12 / 23 jun 2008 |
What are you talking about?? I posted this last year..
| Lor says: |
00:23 / 23 jun 2008 |
Microsoft has redirects on their listed files. Not sure why? Hope they're 301!
Past John, this is Future John. It is one month from now and the world is...different. It took me all week to get here, the last internet cafe on the planet, just to contact you. Now, listen carefully. You must...MUST...stop stumbling. Now! If you fail again...we are all doomed. There's no fate but what we make John. No fate but what we make...
"You can now find those sites with robots.txt." You mean you couldn't do that before? I wasn't aware of any restrictions on viewing that file in the past.
| Strubs says: |
04:01 / 26 jun 2008 |
RULEZ!o/
| claus says: |
09:32 / 26 jun 2008 |
"t3hb0ss says: 12:04 / 22 jun 2008 but before you go and pretend to be a super cool internet detective, you need to know the past tense conjugation of the verb 'to hide', which is HID, not hidded......"
What matters here is the content of the article, not the perfection of the grammar. Since you read and understood what this guy is trying to say, I don't see the problem here. Give him a break, he's trying to help.
| james says: |
12:21 / 26 jun 2008 |
Actually b0ss, if you want to be that technical, it's not "HID", it's "hid"... "the", not "t3h", and "boss", not "b0ss"...
Can't we all just get along? Yeah, right (slaps closest person to me).
| Pravin says: |
16:02 / 27 jun 2008 |
Most sites nowadays do not allow direct access to the folder. They use server-side security and return errors like "Access Denied", which makes it a problem to view the pages directly.
Cool article now check out my BLOG:)
| Barry says: |
07:10 / 28 jun 2008 |
all grammar nazis shall be bitchslapped with a halibut
Check out the wiki on robots if you really want to know what robots.txt hides. The info is really not hidden or secret, just a server precaution. Either way, this is a neat article. Thanks Robert.
http://en.wikipedia.org/wiki/Robots.txt
| Feral says: |
07:24 / 31 jun 2008 |
You're a bunch of fucking idiots. It's no small wonder why our country is so fucked up. You bitches probably voted for those assholes.
| Caleb says: |
14:31 / 04 jul 2008 |
@Feral: Are you assuming that everyone here is from the United States? Not that it really matters, because it seems you just graced us with your presence to be rude and obnoxious. Stop it, go to your room.
Good article, but robots.txt doesn't really hide anything. It is, as stated by 'Mmmmm nice', merely a server precaution. There are ways that a server admin would simply restrict access to a file or directory and nobody, including Google, would be able to look at it.
| rhea says: |
11:17 / 12 jul 2008 |
Cool. I might use this sometime.
Great post!
| ion says: |
05:00 / 19 jul 2008 |
super cool
DISREGARD THAT I SUCK COCKS!!!!
test
hello?
| Nascar says: |
05:41 / 10 aug 2008 |
this is interesting, I wonder what I will hide.