Re: SOP for WebDNA talk - MSNBot Crashing
This WebDNA talk-list message is from 2004
It keeps the original formatting.
numero = 57812
I heard about web crawlers before but didn't investigate them or implement any changes in the code for them.

At times when WebDNA crashed, I saved some of the netstat -a output to a file.

netstat -a > netstat.5-13-2004

I saved them as a reference for the times
./WebCatalogCtl restart or ./WebCatalogCtl stop/start
produced another instance of WebDNA.

LISTENING /tmp/.webcatalog
LISTENING /tmp/.webcatalog

Looking at them now I notice a lot of connections from msnbot.

I put robots.txt (to deny MSNBot) on all the sites except for one.
In 30 mins, the site received 700 requests from msnbot.

All I can say is there is a glitch in MSNBot. A web crawler should
not cripple a site unless you put in some code specifically for the web
crawler (i.e. to increase hits).

Eduardo

----- Original Message -----
From: "Alain Russell"
To: "WebDNA Talk"
Sent: Thursday, May 13, 2004 1:00 PM
Subject: Re: SOP for WebDNA talk - MSNBot Crashing

> Are you sure you're not redirecting the robot around the place .. So it
> ends up bouncing from one page to another ?
> We had a spider that went AWOL on our server once, we took about
> 127,000 page requests in the space of an hour .. no crashing.
>
> Micro$osft are a pain in the arse but I doubt the coders working on their
> spider are stupid ..
>
> On 14/05/2004, at 7:51 AM, wrote:
>
> > How about 10,000 page requests from MSNBot
> > in about 3 hours.
> >
> > I created robots.txt
> > ----------------
> > # MSNBot Search
> > User-agent: msnbot
> > Disallow: /
> > -------------
> >
> > in the root directory of all our sites just to
> > stop webDNA from crashing.
> >
> > I will refine the file later. Right now w/out this file
> > it's like MSNBot is doing a DoS on us.
> >
> > Eduardo
> >
> > ----- Original Message -----
> > From: "Alain Russell"
> > To: "WebDNA Talk"
> > Sent: Thursday, May 13, 2004 12:13 PM
> > Subject: Re: SOP for WebDNA talk - MSNBot Crashing
> >
> >> I don't mean to cause trouble here but I can't remember the last time I
> >> saw WebDNA crash .. not even under heavy load ..
> >> The first thing I would look for here is some bad code .. code that
> >> expects a cart to be passed, eg - what happens if you call the page
> >> with no cart= ?
> >>
> >> I just don't buy the webDNA crashing line anymore ..
> >>
> >> Alain
> >>
> >> On 14/05/2004, at 6:59 AM, Paul Uttermohlen wrote:
> >>
> >>> Blocking the ranges of IP's used by the MSNBot servers via the firewall did
> >>> the trick.... BUT the client wants the sites indexed by MSN.
> >>>
> >>> Gotta let them in... Gotta keep webcat from crashing a thousand times an
> >>> hour...
> >>>
> >>> Paul
> >>>
> >>> On 5/13/04 2:09 PM, "Gary Krockover" wrote:
> >>>
> >>>> Can you do a redirect for MSNBots, something like:
> >>>>
> >>>> [showif [browsername]^MSNBOT]
> >>>> [redirect (to a static sitemap that has no links & doesn't carry the [cart])]
> >>>> [/showif]
> >>>>
> >>>> Not sure if that would rectify the problem, just thinking out loud....
> >>>>
> >>>> GK
> >>>>
> >>>> At 11:37 AM 5/13/2004, you wrote:
> >>>>> Scott,
> >>>>>
> >>>>> This looks like what I experienced last week. I tracked the cart problems
> >>>>> down to MSNBot which was flooding the shopping cart folder with carts. Many
> >>>>> of the carts had file names that were hundreds of characters long. That was
> >>>>> partly due to the way I handle cart naming for repeat customers.
> >>>>>
> >>>>> MSNBot is beta. I think it is flawed. MSN insists that it's just
> >>>>> sophisticated. And perhaps it is WebDNA that is flawed.
> >>>>>
> >>>>> There may be some truth to that. Large volumes of carts or long file names
> >>>>> for carts should not cause Webcatalog to crash, but it does... And with
> >>>>> great frequency and predictability.
> >>>>>
> >>>>> So Scott, if there is anything that you can do to stop this crashing
> >>>>> behavior displayed when MSNBot floods our servers it would be greatly
> >>>>> appreciated.
> >>>>>
> >>>>> Crashes occur on 4.5.1, 5.0.1 and 6
> >>>>>
> >>>>> Thanks, Paul
> >>>
> >>> ________________________________________________________
> >>> Paul Uttermohlen
> >>> http://www.Anoweb.com/
> >>> http://www.Uttermohlen.com/
> >>> Paul@Anoweb.com
> >>> Columbus, Ohio 43026
> >>> 614-529-8963
> >>> _______________________________________________________

-------------------------------------------------------------
This message is sent to you because you are subscribed to the mailing list .
To unsubscribe, E-mail to:
To switch to the DIGEST mode, E-mail to
Web Archive of this list is at: http://webdna.smithmicro.com/
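
For anyone wanting to check a robots.txt like the one quoted in the thread before refining it, here is a minimal sketch using Python's standard urllib.robotparser. It is not from the original posters. The first rule set is the file Eduardo quoted; the "refined" one, the /cart/ path, and the example.com URLs are hypothetical, chosen only to illustrate blocking a cart folder while still letting MSNBot index the rest of a site. It only tells you what a well-behaved crawler should do with the file, not what MSNBot actually did.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt quoted in the thread: deny msnbot everywhere.
ROBOTS_TXT = """\
# MSNBot Search
User-agent: msnbot
Disallow: /
"""

# Hypothetical refinement (assumption, not from the thread): keep msnbot
# out of a cart folder only, so the rest of the site can still be indexed.
REFINED_ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /cart/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com URLs are placeholders for a real site.
    checks = [
        (ROBOTS_TXT, "msnbot", "http://www.example.com/index.html"),
        (ROBOTS_TXT, "Googlebot", "http://www.example.com/index.html"),
        (REFINED_ROBOTS_TXT, "msnbot", "http://www.example.com/cart/abc"),
        (REFINED_ROBOTS_TXT, "msnbot", "http://www.example.com/index.html"),
    ]
    for rules, agent, url in checks:
        print(agent, url, is_allowed(rules, agent, url))
```

A path-level rule of this kind would address Paul's constraint that the client still wants the sites indexed by MSN, assuming the crawler honors robots.txt and the carts live under a single path.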