• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Creating HTML Parser

 
Ranch Hand
Posts: 96
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
As a my final year project i created Web Browser, but with the help of third party parser, so as a part of further development i want to write a HTML parser in java.. But i don't know how to proceed.. Please help me with this.
 
Bartender
Posts: 11497
19
Android Google Web Toolkit Mac Eclipse IDE Ubuntu Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Have you listed all the features your parser should have?
 
Sheriff
Posts: 22781
131
Eclipse IDE Spring VI Editor Chrome Java Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I'd start by checking some existing open source Java HTML parsers. If you can't use them directly in your code, you can at least check out how they've done it.
 
Maneesh Godbole
Bartender
Posts: 11497
19
Android Google Web Toolkit Mac Eclipse IDE Ubuntu Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Um..wouldn't that defeat the purpose of writing your own parser?
 
Rob Spoor
Sheriff
Posts: 22781
131
Eclipse IDE Spring VI Editor Chrome Java Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I'm not saying the entire code should be copy-pasted, but it could be used for hints on how to do it. For something like this, I wouldn't completely reinvent the wheel, not even as part of a school project.
 
harshada patil
Ranch Hand
Posts: 96
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I want to try developing my own parser, HTML parser with minimum functionality and then moving towards advanced functionality.. but i'm not getting how i can proceed, or what first move i should take
 
Marshal
Posts: 28177
95
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
First of all you need to know how parsers work in general. Next you need to decide whether you want to do strict parsing (i.e. reject anything which doesn't conform to the HTML spec) or lenient parsing (i.e. accept anything which vaguely resembles HTML). Then you need a grammar for whatever you decided there. Finally you need to write a parser based on that grammar.
 
harshada patil
Ranch Hand
Posts: 96
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks Paul..

Please suggest me some material regarding how parser work in general, because i searched for it and it is hard to get it..
 
Paul Clapham
Marshal
Posts: 28177
95
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Am I correct in guessing that you know approximately nothing about parsers? Then start with the Wikipedia article: Parser.
 
harshada patil
Ranch Hand
Posts: 96
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I'm an software engineer, and i know about parser ( concept i learned from compiler construction course), as a theoretical part, i know how to create grammar too, but i never tried for designing parser before so..
 
Paul Clapham
Marshal
Posts: 28177
95
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Okay, then it shouldn't be a problem. Just be aware that it's going to be a significant amount of work, so asking vague and general questions on forums is unlikely to advance that process.
 
I AM MIGHTY! Especially when I hold this tiny ad:
a bit of art, as a gift, the permaculture playing cards
https://gardener-gift.com
reply
    Bookmark Topic Watch Topic
  • New Topic