Web Content Mining with Java: Techniques for Exploiting the World Wide Web
List Price: £33.99
Product Description
Unlock the potential of the world's biggest database.
This practical book shows you how to build portals, search engines and other knowledge-based applications to mine the information you need from the Web.
- Written by a developer for developers
- A practical, hands-on approach
- Illustrates how Java and its associated technologies (XML, HTML) can be combined with database technology to display and manipulate Web-derived information more effectively
- Demonstrates how to build a structure browser, a portal and a meta-search engine, and how to make 'Talking Pages'
Product Details
- Amazon Sales Rank: #921621 in Books
- Published on: 2002-03-28
- Original language: English
- Number of items: 1
- Binding: Paperback
- 320 pages
Editorial Reviews
Review
"When I got this book, I couldn't put it down. A lot of computer books sit on the shelf or send me to sleep, but not this one. Not only is it both topical and useful, but it hits a just-about-ideal balance between code and food for thought. The author has a real knack for useful solutions to complex problems." (Java Ranch, 17 May 2002)
From the Back Cover
What do you do with the information at the websites you visit? You read it, print it, and maybe do a screen grab. But you could do so much more with it if only you could get hold of the information in a more usable form: a form that you could manipulate, store and query automatically.
In this book you'll learn how to automate the:
- discovery of websites containing interesting data
- extraction of specific information from HTML and XML pages
- presentation of aggregate information via your own portal
- interpretation of data using text- and data-mining techniques
Java is the language of the web, so all practical examples are provided in the form of Java code that demonstrates HTTP communication, HTML and XML parsing, email retrieval and much more.
This is the book for you if you want some real, practical help to get your Java-based information applications off the ground.
Customer Reviews
Interesting material on how to extract data from the Web.
Tony came up with an ingenious way to parse HTML and convert the DOM of an HTML page into a string which can then be queried.
He provides a tool for viewing these strings; you select one and then use wildcards to bring back similar sets of data. You end up with an SQL-like syntax for pulling out the data. Cool stuff.
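To give a rough feel for the idea described above, here is a minimal sketch. It is not the book's tool or its actual syntax: the class name DomPathQuery, the "path=text" line format and the `*` wildcard are all assumptions made up for illustration, and the standard XML parser used here only accepts well-formed markup, so real-world HTML would need a more tolerant parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration: flatten a DOM into path strings, then query them with wildcards.
class DomPathQuery {

    /** Parses well-formed markup and returns one "path=text" line per non-empty text node. */
    static List<String> flatten(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        List<String> lines = new ArrayList<>();
        Node root = doc.getDocumentElement();
        walk(root, "/" + root.getNodeName() + "[0]", lines);
        return lines;
    }

    /** Depth-first walk; each element is indexed among its same-named siblings. */
    private static void walk(Node element, String path, List<String> out) {
        Map<String, Integer> seen = new HashMap<>();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                String text = child.getNodeValue().trim();
                if (!text.isEmpty()) {
                    out.add(path + "=" + text);
                }
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                String name = child.getNodeName();
                int index = seen.merge(name, 1, Integer::sum) - 1;
                walk(child, path + "/" + name + "[" + index + "]", out);
            }
        }
    }

    /** Returns the text of every line whose path matches the pattern; '*' matches any run of characters. */
    static List<String> query(List<String> lines, String wildcard) {
        StringBuilder regex = new StringBuilder();
        String[] parts = wildcard.split("\\*", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        Pattern pattern = Pattern.compile(regex.toString());
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            int eq = line.indexOf('=');
            if (pattern.matcher(line.substring(0, eq)).matches()) {
                result.add(line.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><table>"
                + "<tr><td>Widget</td><td>9.99</td></tr>"
                + "<tr><td>Gadget</td><td>19.50</td></tr>"
                + "</table></body></html>";
        List<String> lines = flatten(page);
        lines.forEach(System.out::println);
        // Second cell of every row, whatever the row index.
        System.out.println(query(lines, "/html[0]/body[0]/table[0]/tr[*]/td[1]"));
    }
}
```

Picking one concrete path and replacing an index with a wildcard is what turns a single selected value into a query over every similar row.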
The portal stuff looked a bit dated, but the book had a few extra bits that I wish more authors would follow suit on:
1) Include import statements. Their absence is a particular bugbear of mine with so many Apress books.
2) Provide a summary at the end of each chapter briefly describing the APIs.
3) Include comments at the top of the source code to indicate the name of the artifact and where it is in the source directory. I'd always thought I'd do this if I ever got around to writing a book myself. It's the first book I've read that does it. A man after my own heart. :)
