PortalParts.com Site

 Geeklog/PHP and Search Engine Optimization
ronack
 April 02 2005 07:24 AM (Read 8843 times)  

I'm opening this thread to discuss Search Engine Optimization (SEO).

I have been dealing with SEO on a couple of company sites that I administer, and have had long discussions with a friend who runs a National Public Safety site. I also subscribe to several SEO newsletters. I wouldn't call myself an SEO professional, but I'm getting close.

Two of the hot topics are dynamic websites built with PHP and ASP, and the way you do menus on your site. Combine the two and getting your pages indexed by the search engines can range from difficult to impossible.

Issues with PHP:
1. Google and others have a hard time indexing a URL with a question mark "?" and an equals sign "=".

2. They also don't like URLs with more than 3 arguments.
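
If you want to check your own URLs against the two points above, here is a minimal sketch in Python using only the standard library. The helper name `seo_url_issues` and the `MAX_ARGS` constant are made up for illustration, and the limit of 3 is just the rule of thumb from this post, not something published by any search engine.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_ARGS = 3  # assumption: the "more than 3 arguments" rule of thumb from this post


def seo_url_issues(url):
    """Return a list of warnings for a URL (hypothetical helper)."""
    issues = []
    query = urlsplit(url).query
    if query:
        issues.append("contains a query string ('?' and '=')")
        # keep_blank_values=True so that 'a=' still counts as an argument
        args = parse_qsl(query, keep_blank_values=True)
        if len(args) > MAX_ARGS:
            issues.append(f"has {len(args)} arguments (more than {MAX_ARGS})")
    return issues


for url in (
    "http://www.site.com/page.html",
    "http://www.site.com/index.php?a=1&b=2&c=3&d=4",
):
    print(url, seo_url_issues(url))
```

A plain URL comes back with an empty list; a dynamic URL gets one warning for the query string and a second one if it carries more than `MAX_ARGS` arguments.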

Looking at my log files when a search engine hits my site, I see the bot grab a page and then leave. I don't know whether it indexes that page or not. Then it comes right back, grabs another page, and leaves again. It will do half a dozen pages or so, then leave and not come back. I expect there must be a timeout on how long it stays on a site. Needless to say, on a site with some 500 pages, they are not getting indexed. When a bot hits your site it should be able to spider easily through all the pages you allow it to, and not remain on the site more than a few minutes. The MSN bot, however, stayed on my site for close to an hour, hogging bandwidth until it finally took the site down.

There are a few ways to get around this. One is mod_rewrite, which is available in Apache. However, that only works on one platform and I don't use Apache, so I had to come up with other answers.

Issues with menus:
I always thought that a good JavaScript menu would work just fine for indexing web pages, but after several months of watching search engines spider my site I realized that they don't like JS menus all that well, especially if you keep them in an external file.

Oh and a side issue:
Search engines don't tend to spider the entire page. They will only do about 1/4 to 1/2 of the page, depending on its size. This means that things at the bottom of the page may not get spidered. Sometimes they will index the whole page, but only for pages with very little content.



 
Blaine
 April 02 2005 09:21 AM  

I have it on my list to have glMenu generate an SEO-friendly, non-JS menu using the tags. There is a great article on the Milonic site here

Have you tried any of the available mod_rewrite modules for IIS?


Please consider contributing to support my efforts ..
 
Blaine
 April 02 2005 11:37 AM  

I've just spent the last 2 hours trying to find a solution to the problem of using staticlinks on IIS. Passing a URL like http://www.site.com/myscript.php/parm1/value1 will result in a 404 - page not found.

This has been reported on geeklog.net as well: if you enable the $_CONF['url_rewrite'] option, which the staticpages plugin uses, you can no longer access the static pages.

This is an issue only with IIS and certain PHP releases. My desktop is running WinXP Pro + SP2 + IIS 5.1 + PHP 5.0.2 and it works just fine.

My laptop is running the same setup but with PHP 4.3.10, and I get 404s. Try using the admin/install/info.php script and pass it a few additional values separated by a /.

IIS should stop parsing the path at info.php and place the rest of the URL in a variable called PATH_INFO. This variable should only exist when there is extra path info, but instead it is being set all the time, and to the full path.
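
To make the expected behaviour concrete, here is a minimal sketch of how the request path should be split. It is written in Python rather than PHP, and it is not Geeklog or IIS code; the function names `split_path_info` and `path_info_to_params` are made up for illustration.

```python
def split_path_info(request_path, script_suffix=".php"):
    """Split '/myscript.php/parm1/value1' into ('/myscript.php', '/parm1/value1').

    Hypothetical helper that mimics what the web server is expected to do:
    PATH_INFO is empty when nothing follows the script name.
    """
    segments = request_path.split("/")
    for i, segment in enumerate(segments):
        if segment.endswith(script_suffix):
            script = "/".join(segments[: i + 1])
            rest = segments[i + 1:]
            path_info = "/" + "/".join(rest) if rest else ""
            return script, path_info
    return request_path, ""  # no script found, so no extra path info


def path_info_to_params(path_info):
    """Turn '/parm1/value1/parm2/value2' into {'parm1': 'value1', ...}."""
    parts = [p for p in path_info.split("/") if p]
    # Pair names with values; a trailing name without a value is ignored.
    return dict(zip(parts[0::2], parts[1::2]))


script, info = split_path_info("/myscript.php/parm1/value1")
print(script, info, path_info_to_params(info))
```

With a plain request such as /admin/install/info.php the second value comes back empty, which is the case that goes wrong in the setup described above.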

There is an MS TechArticle here, but setting this IIS MetaData property does not appear to have any effect, even after restarting IIS.

I also found this article helpful in explaining the IIS Metadata change, but in the end the change had no effect.


ronack
 April 02 2005 15:12 PM  

I don't use IIS either; I am using Xitami. Xitami is much faster than IIS and Apache, more secure than IIS, and probably easier to use than Apache.

Of course you probably know about TinyURL, which takes a long complicated URL and makes a short URL out of it. What I did was create basically the same thing. I'm not sure how the search engines will react just yet, but we'll see.

What I did was take a URL like this
http://www.causewaylighting.com/index.php?topic=residential_bath&menu=residential_bath_

and the short URL for it looks like this.

http://www.causewaylighting.com/shorturl/bath_fixtures.php

Now when I enter the short URL it actually pulls up the long URL. Try it out.

It was very simple to create. I hope it works for the search engines.
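
The post does not say how the short URLs are implemented, so the following is only a sketch of the general idea in Python: a lookup table from the short path to the long dynamic URL. The table `SHORT_URLS` and the function `resolve_short_url` are hypothetical names; the one entry is the example pair given above.

```python
# Hypothetical table: short path -> long dynamic URL (the example pair from this post)
SHORT_URLS = {
    "/shorturl/bath_fixtures.php":
        "/index.php?topic=residential_bath&menu=residential_bath_",
}


def resolve_short_url(path):
    """Return the long URL for a short path, or None if the path is unknown."""
    return SHORT_URLS.get(path)


print(resolve_short_url("/shorturl/bath_fixtures.php"))
```

The point of the table is that the address a spider sees contains no "?" or "=", while the site still serves the same dynamic page behind it.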


 
ronack
 April 02 2005 15:48 PM  

Here is an example of one of the search engines hitting my site. It grabs a page then leaves, comes back, grabs another page, then leaves. Notice the time between grabs: about 9 minutes, then nearly 38, then 2. All in all it took close to 50 minutes to grab 4 pages, and those were the only 4 pages it grabbed.


84.104.217.38 - - [01/Apr/2005:15:47:57 -0500] "GET /robots.txt HTTP/1.0" 200 257
84.104.217.38 - - [01/Apr/2005:15:48:02 -0500] "GET /advt/index.php?cat=50 HTTP/1.0" 200 26810
84.104.217.38 - - [01/Apr/2005:15:56:50 -0500] "GET /robots.txt HTTP/1.0" 200 257
84.104.217.38 - - [01/Apr/2005:15:56:53 -0500] "GET /advt/index.php?cat=20 HTTP/1.0" 200 23261
84.104.217.38 - - [01/Apr/2005:16:34:32 -0500] "GET /robots.txt HTTP/1.0" 200 257
84.104.217.38 - - [01/Apr/2005:16:34:36 -0500] "GET /advt/list.php?cat=48 HTTP/1.0" 200 16198
84.104.217.38 - - [01/Apr/2005:16:36:40 -0500] "GET /robots.txt HTTP/1.0" 200 257
84.104.217.38 - - [01/Apr/2005:16:36:44 -0500] "GET /advt/index.php?cat=63 HTTP/1.0" 200 17659
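
If you want to measure the gaps rather than eyeball them, here is a minimal Python sketch that parses the log lines above and prints the time between page fetches, ignoring the robots.txt requests. It assumes the log is in the format shown; `parse_line` and `page_fetch_gaps` are made-up helper names.

```python
from datetime import datetime

# The eight log lines from this post, copied as-is.
LOG_LINES = [
    '84.104.217.38 - - [01/Apr/2005:15:47:57 -0500] "GET /robots.txt HTTP/1.0" 200 257',
    '84.104.217.38 - - [01/Apr/2005:15:48:02 -0500] "GET /advt/index.php?cat=50 HTTP/1.0" 200 26810',
    '84.104.217.38 - - [01/Apr/2005:15:56:50 -0500] "GET /robots.txt HTTP/1.0" 200 257',
    '84.104.217.38 - - [01/Apr/2005:15:56:53 -0500] "GET /advt/index.php?cat=20 HTTP/1.0" 200 23261',
    '84.104.217.38 - - [01/Apr/2005:16:34:32 -0500] "GET /robots.txt HTTP/1.0" 200 257',
    '84.104.217.38 - - [01/Apr/2005:16:34:36 -0500] "GET /advt/list.php?cat=48 HTTP/1.0" 200 16198',
    '84.104.217.38 - - [01/Apr/2005:16:36:40 -0500] "GET /robots.txt HTTP/1.0" 200 257',
    '84.104.217.38 - - [01/Apr/2005:16:36:44 -0500] "GET /advt/index.php?cat=63 HTTP/1.0" 200 17659',
]


def parse_line(line):
    """Extract (timestamp, request path) from one log line."""
    stamp = line.split("[", 1)[1].split("]", 1)[0]
    path = line.split('"')[1].split()[1]
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"), path


def page_fetch_gaps(lines):
    """Seconds between consecutive page fetches (robots.txt requests ignored)."""
    times = [t for t, path in map(parse_line, lines) if path != "/robots.txt"]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


print(page_fetch_gaps(LOG_LINES))  # [531, 2263, 128]
```

That is 8 min 51 s, 37 min 43 s, and 2 min 8 s between the four page fetches.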


 