
Several Baidu spider traps to avoid

Date: 2017.06.22 Source: http://www.weidaoshang.cn

I. Flash modules
Flash modules undeniably look impressive, and in that respect they help the user experience. Unfortunately, a spider crawls ordinary HTML and values text above all. To a search engine a Flash file is little more than a bare link: it cannot tell what is inside, so the content does nothing for the site's optimization.
II. JavaScript
The principle is much the same as with Flash. Scripts can make a site look better overall, but the search engine cannot crawl the content they produce. Too much JS also slows page loading considerably, which hurts rankings. This makes heavy JavaScript a fairly serious spider trap.
III. Session IDs
Tracking visitors with a Session ID is a spider trap with very bad consequences. Every time the spider visits such a site it receives a different ID, even when it requests the same page, so it is hard for it to tell which URL is the main one. It may even conclude that the site contains a large number of pages with duplicate content. This should clearly be avoided.
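
One way to see the duplication problem, and one possible remedy, is to strip the session parameter before a URL is linked or declared canonical. The following is a minimal sketch, not a definitive fix: the parameter names in `SESSION_PARAMS` and the `example.com` addresses are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed parameter names; adjust to whatever your platform actually emits.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url: str) -> str:
    """Return the URL with session-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Two visits to the same page, each with a different session ID.
first = strip_session_id("http://example.com/news.php?id=7&sid=abc123")
second = strip_session_id("http://example.com/news.php?id=7&sid=zzz999")
print(first == second)  # both variants collapse to one address
```

Keeping the session in a cookie or on the server, rather than in the address, avoids the problem at its source; the function above only helps where the parameter already appears in links.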
IV. Dynamic URLs with many parameters
The more dynamic URLs a site has, the more likely it is to mislead the search engine. Set up badly, they leave the search engine unable to tell which address is the canonical page (the same principle as with Session IDs), and dynamic URLs are also harder for spiders to crawl. Webmasters are advised to give their sites static URLs and to block URLs that carry abnormal parameters.
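
As a rough illustration of both suggestions, the sketch below maps a dynamic address to a static-style path and rejects anything with unexpected parameters. The URL scheme (`/news.php?id=42` becoming `/news/42.html`) and the rule that only a single numeric `id` is normal are hypothetical choices for this example; a real site would define its own.

```python
from typing import Optional
from urllib.parse import urlsplit, parse_qs


def to_static_path(url: str) -> Optional[str]:
    """Map /news.php?id=42 to /news/42.html; return None for abnormal URLs."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)

    # Assumption: exactly one "id" parameter is the only normal form.
    if set(params) != {"id"} or len(params["id"]) != 1:
        return None

    item_id = params["id"][0]
    if not (item_id.isascii() and item_id.isdigit()):
        return None

    base = parts.path.rsplit(".", 1)[0]  # "/news.php" -> "/news"
    return f"{base}/{item_id}.html"


print(to_static_path("http://example.com/news.php?id=42"))
print(to_static_path("http://example.com/news.php?id=42&from=ad"))
```

URLs for which the function returns `None` are the candidates to block or redirect, so that the spider only ever sees one address per page.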
V. Frame-based pages
Many sites used to be built on a frame structure. Frames keep the code compact and make the site convenient for webmasters to update and maintain, but spiders find this structure very hard to crawl and get almost none of the content inside the frames. When that content is the important part of the page, the damage to optimization is severe.
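
To check whether an existing page still relies on frames, a small audit along the following lines may help. It is only a sketch built on Python's standard `html.parser`; the class name `FrameAudit`, the helper `audit_frames`, and the sample markup are made up for this example.

```python
from html.parser import HTMLParser


class FrameAudit(HTMLParser):
    """Collect the tags that place page content inside frames."""

    FRAME_TAGS = {"frameset", "frame", "iframe"}

    def __init__(self):
        super().__init__()
        self.found = []            # (tag, src) pairs in document order
        self.has_noframes = False  # True if a <noframes> fallback exists

    def handle_starttag(self, tag, attrs):
        if tag in self.FRAME_TAGS:
            self.found.append((tag, dict(attrs).get("src")))
        elif tag == "noframes":
            self.has_noframes = True


def audit_frames(html: str) -> FrameAudit:
    audit = FrameAudit()
    audit.feed(html)
    audit.close()
    return audit


page = (
    '<frameset cols="20%,80%">'
    '<frame src="menu.html"><frame src="main.html">'
    "</frameset>"
)
report = audit_frames(page)
print(report.found, report.has_noframes)
```

A page that reports frames and no `<noframes>` fallback is one where the spider is likely to find nothing but the frame references themselves.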
VI. Pages that require login
This is definitely not advisable. A search engine spider is not a person and is not that smart: it will not fill in a user name, password, or verification code on its own. Pay particular attention to any page set up this way.
VII. Forced cookies
The principle is basically the same as above. A spider will not accept cookies just because the site demands them, so a page that depends on them will not display properly. Forcing cookies only leaves the search engine spider unable to access the site normally.
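
The alternative is to treat cookies as optional: use one when the visitor sends it, but serve the same content when it is missing. The WSGI sketch below shows the idea under stated assumptions; the cookie name `visitor`, the `ARTICLE` text, and the `fetch` helper are all hypothetical and exist only to make the example self-contained.

```python
from http.cookies import SimpleCookie

ARTICLE = "<p>Full article text, visible to every visitor.</p>"


def app(environ, start_response):
    """WSGI app that uses a cookie when present but never requires one."""
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    returning = "visitor" in cookies

    headers = [("Content-Type", "text/html; charset=utf-8")]
    if not returning:
        # Offer a cookie, but serve the page whether or not it is accepted.
        headers.append(("Set-Cookie", "visitor=1; Path=/"))

    start_response("200 OK", headers)
    greeting = "Welcome back" if returning else "Welcome"
    return [f"<h1>{greeting}</h1>{ARTICLE}".encode("utf-8")]


def fetch(cookie_header=""):
    """Call the app directly, as a cookieless spider or a browser would."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    environ = {"HTTP_COOKIE": cookie_header} if cookie_header else {}
    body = b"".join(app(environ, start_response)).decode("utf-8")
    return seen["status"], body


print(fetch())             # request without any cookie
print(fetch("visitor=1"))  # request from a returning browser
```

Both requests receive the article; only the greeting differs, so a spider that ignores cookies still sees the full page.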
