Re: SOP for WebDNA talk - MSNBot Crashing

This WebDNA talk-list message is from 2004. It keeps the original formatting.
I had heard about web crawlers before, but I hadn't investigated them or made any code changes for them. At times when WebDNA crashed, I saved some netstat -a output to a file:

netstat -a > netstat.5-13-2004

I saved these as a reference for the times when ../WebCatalogCtl restart or ./WebCatalogCtl stop/start produced another instance of WebDNA:

LISTENING /tmp/.webcatalog
LISTENING /tmp/.webcatalog

Looking at them now, I notice a lot of connections from msnbot. I put a robots.txt (to deny MSNBot) on all the sites except one. In 30 minutes, that site received 700 requests from msnbot.

All I can say is that there is a glitch in MSNBot. A web crawler should not cripple a site unless you put in some code specifically for the web crawler (i.e. to increase hits).

Eduardo

----- Original Message -----
From: "Alain Russell"
To: "WebDNA Talk"
Sent: Thursday, May 13, 2004 1:00 PM
Subject: Re: SOP for WebDNA talk - MSNBot Crashing

> Are you sure you're not redirecting the robot around the place .. So it
> ends up bouncing from one page to another ?
> We had a spider that went AWOL on our server once, we took about
> 127,000 page requests in the space of an hour .. no crashing.
>
> Micro$osft are a pain in the arse but I doubt the coders working on their
> spider are stupid ..
>
>
> On 14/05/2004, at 7:51 AM, wrote:
>
> > How about 10,000 page requests from MSNBot
> > in about 3 hours.
> >
> > I created robots.txt
> > ----------------
> > # MSNBot Search
> > User-agent: msnbot
> > Disallow: /
> > -------------
> >
> > in the root directory of all our sites just to
> > stop webDNA from crashing.
> >
> > I will refine the file later. Right now w/out this file
> > it's like MSNBot is doing a DoS on us.
> >
> >
> > Eduardo
> >
> > ----- Original Message -----
> > From: "Alain Russell"
> > To: "WebDNA Talk"
> > Sent: Thursday, May 13, 2004 12:13 PM
> > Subject: Re: SOP for WebDNA talk - MSNBot Crashing
> >
> >
> >> I don't mean to cause trouble here but I can't remember the last time I
> >> saw WebDNA crash .. not even under heavy load ..
> >> The first thing I would look for here is some bad code .. code that
> >> expects a cart to be passed, eg - what happens if you call the page
> >> with no cart= ?
> >>
> >> I just don't buy the webDNA crashing line anymore ..
> >>
> >> Alain
> >>
> >>
> >> On 14/05/2004, at 6:59 AM, Paul Uttermohlen wrote:
> >>
> >>> Blocking the ranges of IPs used by the MSNBot servers via the firewall did
> >>> the trick.... BUT the client wants the sites indexed by MSN.
> >>>
> >>> Gotta let them in... Gotta keep webcat from crashing a thousand times an
> >>> hour...
> >>>
> >>> Paul
> >>>
> >>>
> >>> On 5/13/04 2:09 PM, "Gary Krockover" wrote:
> >>>
> >>>> Can you do a redirect for MSNBots, something like:
> >>>>
> >>>> [showif [browsername]^MSNBOT]
> >>>> [redirect (to a static sitemap that has no links & doesn't carry the [cart])]
> >>>> [/showif]
> >>>>
> >>>> Not sure if that would rectify the problem, just thinking out loud....
> >>>>
> >>>> GK
> >>>>
> >>>> At 11:37 AM 5/13/2004, you wrote:
> >>>>> Scott,
> >>>>>
> >>>>> This looks like what I experienced last week. I tracked the cart problems
> >>>>> down to MSNBot which was flooding the shopping cart folder with carts. Many
> >>>>> of the carts had file names that were hundreds of characters long. That was
> >>>>> partly due to the way I handle cart naming for repeat customers.
> >>>>>
> >>>>> MSNBot is beta. I think it is flawed. MSN insists that it's just
> >>>>> sophisticated. And perhaps it is WebDNA that is flawed.
> >>>>>
> >>>>> There may be some truth to that. Large volumes of carts or long file names
> >>>>> for carts should not cause Webcatalog to crash, but it does... And with
> >>>>> great frequency and predictability.
> >>>>>
> >>>>> So Scott, if there is anything that you can do to stop this crashing
> >>>>> behavior displayed when MSNBot floods our servers it would be greatly
> >>>>> appreciated.
> >>>>>
> >>>>> Crashes occur on 4.5.1, 5.0.1 and 6
> >>>>>
> >>>>> Thanks, Paul
> >>>>
> >>>>
> >>>> -------------------------------------------------------------
> >>>> This message is sent to you because you are subscribed to
> >>>> the mailing list .
> >>>> To unsubscribe, E-mail to:
> >>>> To switch to the DIGEST mode, E-mail to
> >>>>
> >>>> Web Archive of this list is at: http://webdna.smithmicro.com/
> >>>
> >>> ________________________________________________________
> >>> Paul Uttermohlen
> >>> http://www.Anoweb.com/
> >>> http://www.Uttermohlen.com/
> >>> Paul@Anoweb.com
> >>> Columbus, Ohio 43026
> >>> 614-529-8963
> >>> _______________________________________________________

Associated Messages, from the most recent to the oldest:

    
  1. Re: SOP for WebDNA talk - MSNBot Crashing ( Paul Uttermohlen 2004)
  2. Re: SOP for WebDNA talk - MSNBot Crashing ( 2004)
  3. Re: SOP for WebDNA talk - MSNBot Crashing ( "Scott Anderson" 2004)
  4. Re: SOP for WebDNA talk - MSNBot Crashing ( Frank Nordberg 2004)
  5. Re: SOP for WebDNA talk - MSNBot Crashing ( 2004)
  6. Re: SOP for WebDNA talk - MSNBot Crashing ( 2004)
  7. Re: SOP for WebDNA talk - MSNBot Crashing ( Paul Uttermohlen 2004)
  8. Re: SOP for WebDNA talk - MSNBot Crashing ( Donovan Brooke 2004)
  9. Re: SOP for WebDNA talk - MSNBot Crashing ( Donovan Brooke 2004)
  10. Re: SOP for WebDNA talk - MSNBot Crashing ( 2004)
  11. Re: SOP for WebDNA talk - MSNBot Crashing ( Alain Russell 2004)
  12. Re: SOP for WebDNA talk - MSNBot Crashing ( 2004)
  13. Re: SOP for WebDNA talk - MSNBot Crashing ( Alain Russell 2004)
  14. Re: SOP for WebDNA talk - MSNBot Crashing ( Paul Uttermohlen 2004)
  15. Re: SOP for WebDNA talk - MSNBot Crashing ( Gary Krockover 2004)
  16. Re: SOP for WebDNA talk - MSNBot Crashing ( Paul Uttermohlen 2004)
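For anyone wanting to confirm what the robots.txt quoted in the thread actually permits, here is a minimal sketch using Python's standard urllib.robotparser. The helper name is_allowed and the test path /index.html are made up for illustration, and the check only tells you what a well-behaved crawler should do; it says nothing about whether MSNBot honored the file at the time.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in the thread.
ROBOTS_TXT = """\
# MSNBot Search
User-agent: msnbot
Disallow: /
"""


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch path.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # /index.html is an arbitrary example path.
    print("msnbot allowed:", is_allowed("msnbot", "/index.html"))
    print("Googlebot allowed:", is_allowed("Googlebot", "/index.html"))
```

With these rules, a compliant msnbot is shut out of the whole site while other crawlers are unaffected, which matches the stopgap described in the thread and also explains why it conflicts with wanting the sites indexed by MSN.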
