Re: [WebDNA] END processing

This WebDNA talk-list message is from 2014. It keeps the original formatting.
numero = 111311
Matt,

That's a good point, and something anyone thinking of setting up a system like this should consider. I have a separate permission table that allows unlimited page views of good bots by IP range, 404's anonymous proxy services, and spams bad bots by name or IP range. After that I count page views by what I believe to be humans and limit at a number of page views.

You would not believe what some people will do. I log all search requests, and have watched people try to sequentially search OEM part numbers and model numbers to build their cross references off of ours. DENIED!

FWIW, don't trust user agent strings. People (and computers) lie.

-Brian B.

On Apr 23, 2014, at 4:52 AM, Psi Prime Inc, Matthew A Perosi wrote:

> Just out of curiosity, have you thought of how this will affect the Googlebot spider? This sounds like the classic definition of cloaking to me.
>
> You might want to add an extra test in your [include] that looks up the user agent and allows msnbot, bingbot, yahoobot, and googlebot to access all the content regardless of your counter.
>
> I'm all for zapping competitors taking your info, but don't accidentally zap yourself from Google search in the process.
>
> -Matt
>
> On 4/22/2014 12:17 PM, Brian Burton wrote:
>> So here's a weird situation.
>>
>> I have a website in the replacement parts business that has extensive cross reference info on it. It requires a full-time employee to maintain the data. A lot of competitors, trying to save a buck, prefer to copy our data rather than do the research themselves. So I developed code that counts the number of page views and after a point cuts off access to the site. As part of an effort to both amuse myself and be "helpful" to competitors that send automated spiders to steal the website, when the cutoff happens I start feeding bogus data as "valid" pages to them. :)
>>
>> This all happens as part of an include file at the top of every page that logs, counts, and issues the redirect to the bogus URL if needed.
>>
>> So here's my next challenge: I don't want to redirect the bad visitors. The redirect is noticeable, and the new URL gives away that it's fake data. I want to (via an include file) build up a page of fake data on the URL they requested, and END all further processing on the page of the legitimate stuff that would happen below the include. Building a hideif would be troublesome due to the complexity of the code on all the pages that this would have to happen on.
>>
>> Thoughts? Suggestions?
>>
>> Thanks!
>> Brian

Associated Messages, from the most recent to the oldest:
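The scheme discussed in this thread (allowlist known-good crawlers by IP range, count page views per visitor, and silently switch to decoy content past a cutoff) would live in a WebDNA [include] at the top of every template, but the logic can be sketched in any language. A minimal Python sketch, with the /19 range, the limit of 200, and the in-memory counter all hypothetical stand-ins for Brian's permission table:

```python
import ipaddress

# Illustrative allowlist; a published Googlebot range is used as an example.
GOOD_BOT_RANGES = [ipaddress.ip_network("66.249.64.0/19")]
PAGE_VIEW_LIMIT = 200  # hypothetical per-visitor cutoff

view_counts = {}  # ip -> page views; a real site would persist this in a table


def is_good_bot(ip: str) -> bool:
    """True if the address falls inside an allowlisted crawler range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in GOOD_BOT_RANGES)


def handle_request(ip: str, render_real, render_fake) -> str:
    """Return the page body for one request: real content for allowlisted
    bots and visitors under the limit, decoy content (same URL, no
    redirect) for everyone over it."""
    if is_good_bot(ip):
        return render_real()  # unlimited views for allowlisted crawlers
    view_counts[ip] = view_counts.get(ip, 0) + 1
    if view_counts[ip] > PAGE_VIEW_LIMIT:
        return render_fake()  # decoy page; skip the legitimate template below
    return render_real()
```

Serving the decoy from `handle_request` and returning immediately mirrors the "END all further processing" goal: the include emits the fake page and the legitimate code below it never runs, with no telltale redirect.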
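Brian's warning that user-agent strings lie has a standard answer: verify a claimed crawler by reverse DNS, then forward-resolve the hostname and confirm it maps back to the same IP. A sketch of that double-lookup check (the suffix list is illustrative; the resolvers are injectable so the logic can be exercised without network access):

```python
import socket


def verify_crawler(ip, reverse=None, forward=None,
                   allowed_suffixes=(".googlebot.com", ".google.com")):
    """Reverse-resolve ip to a hostname, require a trusted suffix, then
    forward-resolve the hostname and confirm it maps back to ip."""
    reverse = reverse or (lambda a: socket.gethostbyaddr(a)[0])
    forward = forward or (lambda h: socket.gethostbyname_ex(h)[2])
    try:
        host = reverse(ip)
    except OSError:
        return False  # no PTR record: not a verifiable crawler
    if not host.endswith(allowed_suffixes):
        return False  # hostname is outside the trusted domains
    try:
        return ip in forward(host)  # forward lookup must round-trip
    except OSError:
        return False
```

A spoofer can set any User-Agent, but cannot make an arbitrary IP reverse-resolve into a trusted domain and forward-resolve back again, so this check holds even when the headers lie.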

    
  1. Re: [WebDNA] END processing (Brian Burton 2014)
  2. Re: [WebDNA] END processing ("Psi Prime Inc, Matthew A Perosi " 2014)
  3. Re: [WebDNA] END processing (Brian Burton 2014)
  4. Re: [WebDNA] END processing (Tom Duke 2014)
  5. [WebDNA] END processing (Brian Burton 2014)


Talk List

The WebDNA community talk-list is the best place to get help: several hundred highly proficient programmers with excellent knowledge of WebDNA and a generous spirit will share all the tips and tricks you can imagine...

Related Readings:

Possible Macv2.1b2 Merge Bug (1997) Single Link browsing (1997) [addlineitems] (1997) [OT] HTML EMAIL program wanted (1999) Options for http uploading of files (1999) Product Name in AdminResults.inc (2001) Date Calulation (1997) Bug or syntax error on my part? (1997) Sub Totals (2000) Summing fields (1997) Hello??? (1997) WebCatalog for Postcards ? (1997) WebObjects (1998) Server crash (1997) Field names beginning with Reg.. (2002) How long until WebDNA makes the list? :( (2004) formatting dates from a field ... (1997) Location of Webcat site in folder hierarchy (1997) More Applescript (1997) Advanced WebCat Guidelines - do they exist beyond docs (1999)