March 2010
Google Index to Go Real Time
Google is developing a system that will enable web publishers of any size to automatically submit new content to Google for indexing within seconds of that content being published. Search industry analyst Danny Sullivan told us today that this could be "the next chapter" for Google.
January 2010
Lucid Imagination » Apache Lucene Connector Framework now in Incubation at the ASF
(via) The Apache Lucene Connector Framework project has officially entered incubation. LCF, for short, is going to be a framework for connecting to content repositories like SharePoint, Documentum, etc. and will make it easy to hook into Lucene, Solr, Nutch, Mahout, and Tika while, of course, remaining agnostic of the final destination of the data. See the Connectors website and the original proposal for more info. Help wanted!
June 2009
Digital Web Magazine - User Interface Implementations of Faceted Browsing
(via) Just as it is important to choose the proper knife when slicing-n-dicing vegetables, it is critical to prescribe a suitable user interface to support faceted filtering. Faceted filtering allows you to narrow down a large list of objects to a manageable size by applying flexible combinations of attribute filters in any order. Rather than forcing you down fixed paths within a website’s information architecture, faceted filtering allows you to multi-dimensionally slice-n-dice the information in a manner that best accommodates your specific needs. A user interface that optimally supports faceted filtering must expose its robust functionality in a way that expresses affordances, controls complexity, and follows existing standards that have been pre-established across the web.
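The filtering behaviour described in this excerpt can be illustrated with a minimal Python sketch. It is not taken from the linked article: the catalogue, the attribute names, and the helper functions `apply_facets` and `facet_counts` are all hypothetical, chosen only to show filters being combined in any order.

```python
from collections import Counter

# Hypothetical catalogue; items and attribute names are illustrative only.
ITEMS = [
    {"name": "Chef's knife", "material": "steel", "length": "long"},
    {"name": "Paring knife", "material": "steel", "length": "short"},
    {"name": "Ceramic slicer", "material": "ceramic", "length": "long"},
    {"name": "Ceramic parer", "material": "ceramic", "length": "short"},
]


def apply_facets(items, selected):
    """Keep items that match every selected facet.

    `selected` maps a facet name to the set of accepted values. Within one
    facet any accepted value matches; across facets all must match, so the
    order in which filters are applied does not change the result.
    """
    return [
        item
        for item in items
        if all(item.get(facet) in values for facet, values in selected.items())
    ]


def facet_counts(items, facet):
    """Count the remaining items per value of `facet`, e.g. for display next to each filter."""
    return Counter(item[facet] for item in items if facet in item)


steel_and_long = apply_facets(ITEMS, {"material": {"steel"}, "length": {"long"}})
long_and_steel = apply_facets(ITEMS, {"length": {"long"}, "material": {"steel"}})
remaining_lengths = facet_counts(apply_facets(ITEMS, {"material": {"steel"}}), "length")
```

Both `steel_and_long` and `long_and_steel` contain the same single item, which is the "any order" property the excerpt describes. A user interface would typically show counts such as `remaining_lengths` next to each filter so users can see how many results a further selection would leave.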
August 2008
RIA and SEO - Fabien Deshayes - From the Rich Client to a Rich Internet
(via) This is a question that often comes up among experts and practitioners in this field: how do you achieve good search engine ranking with RIA technologies (Ajax, Flash, Silverlight, etc.) when the major search engines cannot easily index their content? Workarounds exist, more or less complex and effective depending on the technologies and methods used.
March 2008
Indexable File Formats
File Formats the Google Search Appliance and Google Mini Crawl and Index
The following table lists word processing, spreadsheet, database, presentation, and other formats that the Google Search Appliance and Google Mini can crawl, index, and search. Please note the following:
* The Google Mini and Google Search Appliance cannot crawl, index, or search any file formats that are not listed.
* Text embedded in graphics is not indexed.
The Google Search Appliance and Google Mini cannot index text contained in graphic file formats, such as JPEG, GIF, or TIFF. When a file in a graphic format is submitted for indexing, text embedded in the graphic is not indexed. However, the file name is indexed. If any metadata is associated with the graphic in HTML meta tags, that metadata is indexed.
* Encrypted, viewable PDF documents are converted to HTML for indexing, but the cached HTML is not displayed.
* PDF files created by scanning with optical character recognition (OCR) software are supported.
* If you are using the Google Search Appliance, metadata can be fed from a database and then indexed.
* Files in XML format cannot be crawled or indexed.
* The contents of compressed file formats, such as ZIP or tar files, cannot be indexed.
Google Enterprise Solutions
Feature comparison