
Best Java API for balancing HTML tags

 
Ajay Dhar
Ranch Hand
Posts: 30
Does anyone know of a really good API for balancing HTML tags? Say for example I have the following HTML snippet:

Feedback control is the basic mechanism by which systems, whether mechanical, electrical, or biological, maintain their equilibrium or homeostasis. In the higher life forms, the conditions under which life can continue are quite narrow. A change in body temperature of half a degree is generally a sign of illness. The homeostasis of the body is maintained through the use of feedback control [Wiener 1948]. A primary contribution of C.R. Darwin during the last century was the theory that feedback over long time periods is responsible for the evolution of species. In 1931 V. Volterra explained the balance between two populations of fish in a closed pond using the theory of feedback.</P>
Feedback control may be defined as the use of difference signals, determined by comparing the actual values of system variables to their desired values, as a means of controlling a system. An everyday example of a feedback control system is an automobile speed control, which uses the difference between the actual and the desired speed to vary the fuel flow rate. Since the system output is used to regulate its input, such a device is said to be a <em>closed-loop control system</em>.</P>
In this book we shall show how to use <em>modern control theory</em> to design feedback control systems. Thus, we are concerned not with natural control systems, such as those that occur in living organisms or in society, but with man-made control systems such as those used to control aircraft, automobilies, satellites, robots, and industrial processes.</P>
Realizing that the best way to understand an area is to examine its evolution and the reasons for its existence, we shall first provide a short history of automatic control theory. Then, we give a brief discussion of the philosophies of classical and modern control theory.</P>
The references for Chapter 1 are at the end of this chapter. The references for the remainder of the book appear at the end of the book.</P>

Each paragraph ends with a closing </P> tag, but none of them starts with an opening <P> tag. I'm looking for an API that balances the <P> tags accurately, so that I can convert the HTML page to XML and extract the paragraphs with XPath.

Thanks,
Ajay
 
Tim Moores
Bartender
Posts: 2895
Check out NekoHTML, JTidy, TagSoup and HtmlCleaner.
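
Those libraries are the robust route for real-world tag soup. For the narrow case in the question (every paragraph ends in a closing tag, none has an opening one, and the inline markup such as <em> is already balanced), a JDK-only sketch might look like the one below. The names ParagraphBalancer, balance and extractParagraphs are hypothetical and not part of any of the libraries above. It also assumes there are no HTML-only entities such as &nbsp; in the text, since the plain XML parser would reject them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, a sketch only; not a general HTML cleaner.
class ParagraphBalancer {

    // Wraps every chunk of text that precedes a closing </p> or </P>
    // in a matching lowercase <p>...</p> pair.
    static String balance(String html) {
        StringBuilder out = new StringBuilder();
        // Split on closing paragraph tags, ignoring case.
        for (String chunk : html.split("(?i)</p\\s*>")) {
            // Drop an attribute-less opening <p> if one is already present.
            String text = chunk.trim().replaceFirst("(?i)^<p\\s*>", "").trim();
            if (text.isEmpty()) {
                continue; // skip whitespace between paragraphs
            }
            out.append("<p>").append(text).append("</p>");
        }
        return out.toString();
    }

    // Balances the snippet, parses it as XML under an assumed <root>
    // wrapper element, and returns the text of each /root/p node.
    static List<String> extractParagraphs(String html) {
        try {
            String xml = "<root>" + balance(html) + "</root>";
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate("/root/p", doc, XPathConstants.NODESET);
            List<String> paragraphs = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                paragraphs.add(nodes.item(i).getTextContent());
            }
            return paragraphs;
        } catch (Exception e) {
            // Parsing fails if the remaining markup is not well-formed XML.
            throw new IllegalStateException("Snippet is not well-formed after balancing", e);
        }
    }

    public static void main(String[] args) {
        String snippet = "Such a device is a <em>closed-loop control system</em>.</P>\n"
                + "The references for Chapter 1 are at the end of this chapter.</P>";
        for (String paragraph : extractParagraphs(snippet)) {
            System.out.println(paragraph);
        }
    }
}
```

Anything beyond this simple pattern (attributes on the tags, unclosed inline elements, entities) is where one of the parsers above becomes the better choice, since they repair the markup while building the document tree.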
 