Issues in Monitoring Web Data, Proc. DEXA, 2002. ,
DOI : 10.1007/3-540-46146-9_1
The web changes everything, Proceedings of the Second ACM International Conference on Web Search and Data Mining, WSDM '09, 2009. ,
DOI : 10.1145/1498759.1498837
Extracting lists of data records from semi-structured web pages, Data & Knowledge Engineering, vol.64, issue.2, 2008. ,
DOI : 10.1016/j.datak.2007.10.002
Semantic deep Web: automatic attribute extraction from the deep Web data sources, Proc. SAC, 2007. ,
Enriching ontology for deep Web search, Proc. DEXA, 2008. ,
Extracting structured data from Web pages, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SIGMOD '03, 2003. ,
DOI : 10.1145/872757.872799
A fast HTML web page change detection approach based on hashing and reducing the number of similarity computations, Data & Knowledge Engineering, vol.66, issue.2, 2008. ,
DOI : 10.1016/j.datak.2008.04.003
Approximate matching of hierarchical data using pq-grams, Proc. VLDB, 2005. ,
Web dynamics, structure, and page quality, 2004. ,
DOI : 10.1007/978-3-662-10874-1_5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.59.3769
SourceRank: Relevance and trust assessment for deep Web sources based on inter-source agreement, Proc. WWW, 2011. ,
Template detection via data mining and its applications, Proceedings of the eleventh international conference on World Wide Web, WWW '02, 2002. ,
DOI : 10.1145/511446.511522
Siphoning hidden-Web data through keyword-based interfaces, Art. J. Information and Data Management, vol.1, issue.1, 2004. ,
Tree-to-tree correction for document trees, 1995. ,
Exploiting Relation Extraction for Ontology Alignment, Proc. ISWC, 2010. ,
DOI : 10.1007/11574620_52
Parallel asynchronous Hungarian methods for the assignment problem, Art. INFORMS J. Computing, vol.5, issue.3, 1993. ,
DBpedia - A crystallization point for the Web of Data, Web Semantics: Science, Services and Agents on the World Wide Web, vol.7, issue.3, 2009. ,
DOI : 10.1016/j.websem.2009.07.002
Visual structure-based web page clustering and retrieval, Proceedings of the 19th international conference on World wide web, WWW '10, 2010. ,
DOI : 10.1145/1772690.1772807
On the resemblance and containment of documents, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171), 1997. ,
DOI : 10.1109/SEQUEN.1997.666900
Syntactic clustering of the Web, Computer Networks and ISDN Systems, vol.29, issue.8-13, pp.8-13, 1997. ,
DOI : 10.1016/S0169-7552(97)00031-7
Indexing and querying segmented web pages: the block Web model, Art. World Wide Web, vol.14, pp.5-6, 2011. ,
A short survey of document structure similarity algorithms, Proc. International Conference on Internet Computing, 2004. ,
A fully automated object extraction system for the World Wide Web, Proceedings 21st International Conference on Distributed Computing Systems, 2001. ,
DOI : 10.1109/ICDSC.2001.918966
VIPS: a vision-based page segmentation algorithm, 2003. ,
An efficient language-independent method to extract content from news Webpages, Proc. DocEng, 2011. ,
Probe, cluster, and discover: focused extraction of QA-Pagelets from the deep Web, Proceedings. 20th International Conference on Data Engineering, 2004. ,
DOI : 10.1109/ICDE.2004.1319988
The paths more taken, Proceedings of the 19th international conference on World wide web, WWW '10, 2010. ,
DOI : 10.1145/1772690.1772713
A survey of Web information extraction systems, Art. IEEE Trans. on Knowl. and Data Eng, issue.10, p.18, 2006. ,
Meaningful change detection in structured data, Proc. SIGMOD, 1997. ,
Change detection in hierarchically structured information, Proc. ACM, 1996. ,
Semantic relation extraction from socially-generated tags: a methodology for metadata generation, Proc. DC, 2008. ,
The evolution of the Web and implications for an incremental crawler, Proc. VLDB, 2000. ,
Estimating frequency of change, ACM Transactions on Internet Technology, vol.3, issue.3, 2003. ,
DOI : 10.1145/857166.857170
Gimme' the context, Proceedings of the 14th international conference on World Wide Web, WWW '05, 2005. ,
DOI : 10.1145/1060745.1060796
Concerning Etags and datestamps, Proc. IWAW, 2004. ,
A Comparative Study of XML Change Detection Algorithms, Service and Business Computing Solutions with XML. IGI Global, 2009. ,
DOI : 10.4018/978-1-60566-330-2.ch002
Detecting changes in XML documents, Proceedings 18th International Conference on Data Engineering, 2002. ,
DOI : 10.1109/ICDE.2002.994696
Roadrunner: Towards automatic data extraction from large Web sites, Proc. VLDB, 2001. ,
Clustering Web pages based on their structure, Art. Data and Knowledge Engineering, vol.54, issue.3, 2005. ,
OntoMiner, Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, WWW Alt. '04, 2004. ,
DOI : 10.1145/1013367.1013545
Automatic Web news extraction using tree edit distance, Proc. WWW, 2004. ,
Understanding narrative interest: Some evidence on the role of unexpectedness, Proc. CogSci, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00479573
Building the schema of a Web entity dynamically, Art. Journal of Computational Information Systems, 2011. ,
Rate of change and other metrics: a live study of the World Wide Web, Proc. USITS, 1997. ,
Yih-Farn Chen, and Eleftherios Koutsofios. The AT&T Internet difference engine: Tracking and viewing changes on the Web, World Wide Web, vol.1, issue.1, 1998. ,
Ontology-Based Focused Crawling of Deep Web Sources, Proc. KSEM, 2007. ,
DOI : 10.1007/978-3-540-76719-0_51
A large-scale study of the evolution of Web pages, Proc. WWW, 2003. ,
Syndicating Web Sites with RSS Feeds for Dummies, 2005. ,
Efficient and effective Web page change detection, Art. Data and Knowledge Engineering, vol.46, issue.2, 2007. ,
Real understanding of real estate forms, Proceedings of the International Conference on Web Intelligence, Mining and Semantics, WIMS '11, 2011. ,
DOI : 10.1145/1988688.1988704
How the Minotaur Turned into Ariadne: Ontologies in Web Data Extraction, Proc. ICWE, 2011. ,
DOI : 10.1007/978-3-642-22233-7_2
Forms form patterns: reusable form understanding, Proc. WWW, 2012. ,
Automatically learning gazetteers from the deep web, Proceedings of the 21st international conference companion on World Wide Web, WWW '12 Companion, 2012. ,
DOI : 10.1145/2187980.2188044
The volume and evolution of web page templates, Special interest tracks and posters of the 14th international conference on World Wide Web, WWW '05, 2005. ,
DOI : 10.1145/1062745.1062763
In Search of the Lost Schema, Proc. ICDT, 1999. ,
DOI : 10.1007/3-540-49257-7_20
XRANK, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SIGMOD '03, 2003. ,
DOI : 10.1145/872757.872762
Discovering complex matchings across Web query interfaces: A correlation mining approach, Proc. KDD, 2004. ,
Schema Matching across Query Interfaces on the Deep Web, Proc. BNCOD, 2008. ,
DOI : 10.1007/978-3-540-70504-8_6
Sampling, information extraction and summarisation of Hidden Web databases, Data & Knowledge Engineering, vol.59, issue.2, 2006. ,
DOI : 10.1016/j.datak.2006.01.009
Heterogeneous Web data search using relevance-based on the fly data integration, Proc. WWW, 2012. ,
Implementing Preservation Strategies for Complex Multimedia Objects, Proc. ECDL, 2003. ,
DOI : 10.1007/978-3-540-45175-4_43
Application of Kalman filters to identify unexpected change in blogs, Proc. JCDL, 2008. ,
Distributed search over the hidden Web: hierarchical database sampling and selection, Proc. VLDB, 2002. ,
WebVigiL: An Approach to Just-In-Time Information Propagation in Large Network-Centric Environments, 2004. ,
DOI : 10.1007/978-3-662-10874-1_13
CX-DIFF: a change detection algorithm for XML content and change visualization for WebVigiL, Data & Knowledge Engineering, vol.52, issue.2, 2005. ,
DOI : 10.1016/S0169-023X(04)00102-8
Detecting age of page content, Proceedings of the 9th annual ACM international workshop on Web information and data management, WIDM '07, 2007. ,
DOI : 10.1145/1316902.1316925
WISDOM: Web intrapage informative structure mining based on document object model, IEEE Transactions on Knowledge and Data Engineering, vol.17, issue.5, pp.614-627, 2005. ,
DOI : 10.1109/TKDE.2005.84
A Novel Approach for Web Page Change Detection System, International Journal of Computer Theory and Engineering, vol.2, issue.3, 2010. ,
DOI : 10.7763/IJCTE.2010.V2.168
Understanding deep web search interfaces, Proc. SIGMOD, p.39, 2010. ,
DOI : 10.1145/1860702.1860708
An Efficient Web Page Change Detection System Based on an Optimized Hungarian Algorithm, IEEE Transactions on Knowledge and Data Engineering, vol.19, issue.5, 2007. ,
DOI : 10.1109/TKDE.2007.1014
Boilerplate detection using shallow text features, Proceedings of the third ACM international conference on Web search and data mining, WSDM '10, 2010. ,
DOI : 10.1145/1718487.1718542
The performance of universal encoding, Art. IEEE Transactions on Information Theory, vol.27, issue.2, pp.199-206, 1981. ,
Accurate and efficient crawling the deep Web: Surfacing hidden value, Art. International J. Computer Science and Information Security, vol.9, issue.5, 2011. ,
Wrapper induction for information extraction, Proc. IJCAI, 1997. ,
An efficient algorithm to compute differences between structured documents, IEEE Trans. on Knowl. and Data Eng, issue.8, p.16, 2004. ,
Wanli Zuo, and Fengling He. Ontology based automatic attributes extracting and queries translating for deep Web, Art. J. Software, vol.5, 2008. ,
An automated change-detection algorithm for HTML documents based on semantic hierarchies, Proc. ICDE, 2001. ,
Annotating and searching web tables using entities, types and relationships, Proc. VLDB, 2010. ,
DOI : 10.14778/1920841.1921005
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.174.5615
ROUGE: a package for automatic evaluation of summaries, Proc. Workshop on Text Summarization Branches Out (WAS), 2004. ,
NET - A System for Extracting Web Data from Flat and Nested Data Records, Proc. WISE, 2005. ,
DOI : 10.1007/11581062_39
Mining data records in Web pages, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, 2003. ,
DOI : 10.1145/956750.956826
WebCQ-detecting and delivering information changes on the web, Proceedings of the ninth international conference on Information and knowledge management, CIKM '00, 2000. ,
DOI : 10.1145/354756.354860
URL : https://hal.archives-ouvertes.fr/tel-00259428
Temporal Processing of News: Annotation of Temporal Expressions, Verbal Events and Temporal Relations ,
Ontology-driven information extraction with OntoSyphon, Proc. ISWC, 2006. ,
Extracting semantic structure of Web documents using content and visual information, Proc. WWW, 2005. ,
Extracting data records from the web using tag path clustering, Proceedings of the 18th international conference on World wide web, WWW '09, 2009. ,
DOI : 10.1145/1526709.1526841
A guided tour to approximate string matching, ACM Computing Surveys, vol.33, issue.1, 2001. ,
DOI : 10.1145/375360.375365
Extracting schema from semistructured data, Proc. SIGMOD, 1998. ,
What's new on the Web? the evolution of the Web from a search engine perspective, Proc. WWW, 2004. ,
Using neighbors to date web documents, Proceedings of the 9th annual ACM international workshop on Web information and data management, WIDM '07, 2007. ,
DOI : 10.1145/1316902.1316924
URL : http://hdl.handle.net/10216/5255
Archivage du contenu éphémère du Web à l'aide des flux Web, Proc. BDA Conference without formal proceedings. (Demonstration), 2010. ,
Archiving data objects using Web feeds, Proc. IWAW, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00537962
Deriving dynamics of Web pages: A survey, Proc. TWAW, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00588715
FOREST, Proceedings of the 18th International Workshop on Web and Databases, WebDB'15, 2012. ,
DOI : 10.1145/2767109.2767112
URL : https://hal.archives-ouvertes.fr/hal-01178402
Cross-fertilizing deep Web analysis and ontology enrichment, Proc. VLDS, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00737941
Extracting article text from the web with maximum subsequence segmentation, Proceedings of the 18th international conference on World wide web, WWW '09, 2009. ,
DOI : 10.1145/1526709.1526840
A novel Web archiving approach based on visual pages analysis, Proc. IWAW, 2009. ,
ArchivePress: A really simple solution to archiving blog content, Proc. iPRES, 2009. ,
Hidden-state Conditional Random Fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007. ,
Crawling the hidden Web, Proc. VLDB, 2001. ,
Automatic detection of fragments in dynamically generated web pages, Proceedings of the 13th conference on World Wide Web, WWW '04, 2004. ,
DOI : 10.1145/988672.988732
Exploiting WordNet as background knowledge, Proc. ISWC Ontology Matching (OM-07) Workshop, 2007. ,
Page Digest for large-scale Web services, IEEE International Conference on E-Commerce, 2003. CEC 2003., 2003. ,
DOI : 10.1109/COEC.2003.1210274
Hardness of string similarity search and other indexing problems, Proc. ICALP, 2004. ,
Automatic wrapper induction from hidden-web sources with domain knowledge, Proceeding of the 10th ACM workshop on Web information and data management, WIDM '08, 2008. ,
DOI : 10.1145/1458502.1458505
URL : https://hal.archives-ouvertes.fr/inria-00337098
Efficient Monitoring Algorithm for Fast News Alerts, IEEE Transactions on Knowledge and Data Engineering, vol.19, issue.7, 2007. ,
DOI : 10.1109/TKDE.2007.1041
Incremental crawling with Heritrix, Proc. IWAW, 2005. ,
sitemaps.org. Sitemaps XML format, 2008. ,
Learning block importance models for web pages, Proceedings of the 13th conference on World Wide Web, WWW '04, 2004. ,
DOI : 10.1145/988672.988700
Catch me if you can. Temporal coherence of Web archives, Proc. IWAW, 2008. ,
A String Metric for Ontology Alignment, Proc. ISWC, 2005. ,
DOI : 10.1007/11574620_45
How to choose a digital preservation strategy, Proceedings of the 2007 conference on Digital libraries, JCDL '07, 2007. ,
DOI : 10.1145/1255175.1255181
ODE, ACM Transactions on Database Systems, vol.34, issue.2, 2009. ,
DOI : 10.1145/1538909.1538914
Combining linguistic and statistical analysis to extract relations from Web documents, Proc. KDD, 2006. ,
YAGO: A core of semantic knowledge unifying WordNet and Wikipedia, Proc. WWW, 2007. ,
PARIS: Probabilistic alignment of relations, instances, and schema, Proc. VLDB Endow, 2011. ,
Multiway SLCA-based keyword search in XML data, Proceedings of the 16th international conference on World Wide Web, WWW '07, 2007. ,
DOI : 10.1145/1242572.1242713
Dynamic Web file format transformations with Grace, Proc. IWAW, 2005. ,
Contextual and metadata-based approach for the semantic annotation of heterogeneous documents, Proc. SeMMA, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00293255
Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents, Proc. DEXA, 2009. ,
DOI : 10.1075/term.9.1.06dro
URL : https://hal.archives-ouvertes.fr/hal-00423575
Extracting result schema based on query instances in the deep Web, Art. Wuhan University J. Natural Sciences, vol.12, issue.5, 2007. ,
A fast and robust method for web page template detection and removal, Proceedings of the 15th ACM international conference on Information and knowledge management, CIKM '06, 2006. ,
DOI : 10.1145/1183614.1183654
Data extraction and label assignment for web databases, Proceedings of the twelfth international conference on World Wide Web, WWW '03, 2003. ,
DOI : 10.1145/775152.775179
Instance-based Schema Matching for Web Databases by Domain-specific Query Probing, Proc. VLDB, 2004. ,
DOI : 10.1016/B978-012088469-8.50038-3
Language-independent set expansion of named entities using the Web, Proc. ICDM, 2007. ,
X-Diff: an effective change detection algorithm for XML documents, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405), 2003. ,
DOI : 10.1109/ICDE.2003.1260818
Content extraction via tag ratios, Proc. WWW, 2010. ,
Merging interface schemas on the deep Web via clustering aggregation, Proc. Data Mining, 2005. ,
Bootstrapping Domain Ontology for Semantic Web Services from Source Web Sites, Proc. VLDB Workshop on Technologies for E-Services, 2005. ,
DOI : 10.1007/11607380_2
Change Detection in Web Pages, 10th International Conference on Information Technology (ICIT 2007), 2007. ,
DOI : 10.1109/ICIT.2007.37
Parallel crawler architecture and Web page change detection, Art. WSEAS Transactions on Computers, vol.7, issue.7, 2008. ,
Improving pseudo-relevance feedback in web information retrieval using web page segmentation, Proceedings of the twelfth international conference on World Wide Web, WWW '03, 2003. ,
DOI : 10.1145/775152.775155
Understanding the Search Interfaces of the Deep Web Based on Domain Model, 2009 Eighth IEEE/ACIS International Conference on Computer and Information Science, 2009. ,
DOI : 10.1109/ICIS.2009.32
Web data extraction based on partial tree alignment, Proceedings of the 14th international conference on World Wide Web, WWW '05, 2005. ,
DOI : 10.1145/1060745.1060761
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.66.277
Understanding Web query interfaces, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, SIGMOD '04, 2004. ,
DOI : 10.1145/1007568.1007583
Simultaneous record detection and attribute labeling in web data extraction, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '06, 2006. ,
DOI : 10.1145/1150402.1150457