[WebDNA] More thoughts about [middle]
This WebDNA talk-list message is from 2015
It keeps the original formatting.
Chris,

This is an improvement over my previous suggestion for improving the [middle] context ... or at least that's how I see it:

One way to give middle the ability to extract similar individual tags from a HTML page might be something like this:

startAfter
continueUntil
repeatUntil
variableName

My thought here is that middle would start after the first matching "startAfter" value, then it would continue from there until it finds the next "continueUntil" value ...

Then it would keep REPEATING the same "startAfter" and "continueUntil" procedure -- from the last place it found a match -- thus finding more matches until it has repeated "repeatUntil" times (which could be a positive whole number value or [end]) ...

And every time it finds a match it would set a text variable to the value found between the latest "startAfter" and "continueUntil" values.

In other words, doing this:

[middle startafter=<img src="[!][/!]&continueUntil=">[!][/!]&repeatUntil=[end][!][/!]&variableName=imagePath]
here's a span<img src="/images/abc.gif">
text here<img src="ghi.gif">
<img src="thumbnails/jkl.gif">
this is a paragraph
text here<img src="pqr.jpg">
this is a div
text here<img src="vwx.gif">
<img src="/logos/yz.jpg">
[/middle]

... would result in setting these text variables:

imagePath1 = /images/abc.gif
imagePath2 = def.png
imagePath3 = ghi.gif
imagePath4 = thumbnails/jkl.gif
imagePath5 = mno.png
imagePath6 = pqr.jpg
imagePath7 = stu.gif
imagePath8 = vwx.gif
imagePath9 = /logos/yz.jpg

In this case there would be no results displayed inside the middle context because the found values have been set as text variables. But if the "variableName" parameter were not used, those values would instead be displayed inside the middle context rather than set as text vars.

Something like "matchCase=T" might be a nice option too in case we need to find exact lettercase matches.

To me this is probably the best way to improve [middle] because it actually gives us the ability to find and extract similar HTML tags that are repeated in a page.

Last week I wanted to write a script that checks my client's website to see if the images referenced in each of his web pages actually exist, but I stopped when I realized that WebDNA does not have a simple way to parse the HTML and extract the img tags. However, having the above capability would make an image-checking script a no-brainer.

:)

Regards,
Kenneth Grome
WebDNA Solutions
http://www.webdnasolutions.com
Web Database Systems and Linux Server Management

On 01/23/2015 10:29 AM, David Bastedo wrote:
> Thanks Ken & Tom,
>
> as soon as I understood what Ken was saying, I knew what I want to
> do is impossible
>
> I literally want to pluck open graph or other meta data off of a
> page, no matter where it is by just using its tag and an end point.
>
> If I know what tags I am looking for explicitly - I could put them
> in a table and loop through looking for whatever I wanted, then I
> could define the end - working "forward" from the opening of the
> tag "og: title" for example, and end at the close of the tag "/"
> and be able to pull out dynamically any meta tag I could possibly
> think of.... or want.
>
> That would be pretty straightforward and very powerful.
> I can accomplish this task by creating a one-off relationship
> between a page and its tags - say for twitter - it's an easy way to
> grab an image - but it's not dynamic. I want to do this for any type
> of page.
>
> d.
>
> On Fri, Jan 23, 2015 at 7:35 AM, Tom Duke wrote:
>
> David,
>
> Hi - you won't be able to achieve what you are trying to do
> with [middle]. You might be able to hack something together
> using [grep] or [listwords]. Though Stackoverflow is full of
> articles outlining why regex should not be used to parse HTML.
> (http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)
>
> Your example shows why a proper HTML parser within WebDNA
> would be really useful. For example if you paste your code
> into this page:
>
> http://try.jsoup.org
>
> and type "meta" into the CSS Query box you'll see how an HTML
> parser does the job.
>
> - Tom
>
> ==============================================
> Digital Revolutionaries
> 1st Floor, Castleriver House
> 14-15 Parliament Street
> Temple Bar, Dublin 2
> Ireland
> ----------------------------------------------
> [t]: + 353 1 4403907
> ==============================================
>
> On 23 January 2015 at 00:11, David Bastedo wrote:
>
> To your point, I never switched out your test variable
> properly
> To my point, I hate when you are right.
> I get the same results.
>
> However, as opposed to blaming me for not understanding
> how the friggin thing works, the docs aren't very clear
> and after seeing your example I now understand "backwards"
> for the reality that it is.
>
> There is no hope in hell of doing what I want with middle.
>
> Your first example is not as good as your second example
> to illustrate the concept. Thank you for taking the time
> with the second example, it illustrates backwards much more
> effectively.
>
> d.
>
> ---------------------------------------------------------
> This message is sent to you because you are subscribed to
> the mailing list __.
> To unsubscribe, E-mail to: __
> archives: http://mail.webdna.us/list/talk@webdna.us
> Bug Reporting: support@webdna.us
>
> --
> David Bastedo
>
> Ten Plus One Communications Inc.
> http://www.10plus1.com
> 416.277.4499
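
The repeating startAfter / continueUntil / repeatUntil behavior proposed in this message can be sketched outside WebDNA to make the matching rules concrete. The Python below is only an illustration of the idea as described here; it is not WebDNA code and not an existing [middle] feature. The function names extract_between and as_numbered_variables are hypothetical, repeat_until=None stands in for [end], and match_case mirrors the suggested "matchCase=T" option.

```python
import re


def extract_between(text, start_after, continue_until,
                    repeat_until=None, match_case=False):
    """Collect every value found between start_after and continue_until.

    Hypothetical helper: repeat_until=None plays the role of [end],
    meaning "keep going until no further match is found".
    """
    flags = 0 if match_case else re.IGNORECASE
    # re.escape makes both delimiters literal strings, not patterns.
    start_re = re.compile(re.escape(start_after), flags)
    end_re = re.compile(re.escape(continue_until), flags)

    found = []
    pos = 0
    while repeat_until is None or len(found) < repeat_until:
        start = start_re.search(text, pos)
        if start is None:
            break
        end = end_re.search(text, start.end())
        if end is None:
            break
        found.append(text[start.end():end.start()])
        # Resume from the last place a match was found.
        pos = end.end()
    return found


def as_numbered_variables(name, values):
    """Name the values name1, name2, ... like the imagePath1.. list above."""
    return {f"{name}{i}": value for i, value in enumerate(values, start=1)}


sample = (
    '<span>here\'s a span<img src="/images/abc.gif"></span>\n'
    '<p>text here<img src="ghi.gif"></p>\n'
    '<IMG SRC="thumbnails/jkl.gif">'
)
paths = extract_between(sample, '<img src="', '">')
print(as_numbered_variables("imagePath", paths))
```

With the default case-insensitive matching, the uppercase IMG tag in the sample is picked up too; passing match_case=True would skip it. Leaving out the numbering step corresponds to the case where "variableName" is not supplied and the found values are simply returned.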
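
For comparison with the HTML-parser approach Tom mentions, here is a minimal sketch using Python's standard html.parser module rather than jsoup or anything built into WebDNA. It gathers img src values (the image-checking use case) and meta tags keyed by their property or name attribute (the Open Graph use case). The class name TagAttributeCollector and the helper collect are hypothetical, chosen only for this illustration.

```python
from html.parser import HTMLParser


class TagAttributeCollector(HTMLParser):
    """Collect img sources and meta tag contents from an HTML page."""

    def __init__(self):
        super().__init__()
        self.image_sources = []
        self.meta_properties = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this,
        # and self-closing tags such as <meta ... /> are routed here as well.
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.image_sources.append(attributes["src"])
        elif tag == "meta":
            key = attributes.get("property") or attributes.get("name")
            if key and attributes.get("content") is not None:
                self.meta_properties[key] = attributes["content"]


def collect(html_text):
    """Return (image sources, meta properties) found in html_text."""
    collector = TagAttributeCollector()
    collector.feed(html_text)
    collector.close()
    return collector.image_sources, collector.meta_properties


sample = (
    '<head><meta property="og:title" content="Example page">'
    '<meta name="description" content="A test" /></head>'
    '<body><IMG alt="x" SRC=\'/logos/yz.jpg\'></body>'
)
images, meta = collect(sample)
print(images)
print(meta)
```

Because the parser works on tags and attributes instead of raw text, attribute order, quoting style, and letter case do not affect what is found, which is the practical difference from delimiter-based extraction.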