A B C D E F G H I J K L M N O P Q R S T U V W Y Z 

A

AbstractAnalyzer - Class in org.dice_research.squirrel.analyzer
Abstract class to define a constructor and UriCollector for analyzers.
AbstractAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.AbstractAnalyzer
 
AbstractBufferingSink - Class in org.dice_research.squirrel.sink.impl.sparql
An abstract implementation for TripleBasedSinks and QuadBasedSinks
AbstractBufferingSink() - Constructor for class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
AbstractDecompressor - Class in org.dice_research.squirrel.analyzer.compress.impl
 
AbstractDecompressor() - Constructor for class org.dice_research.squirrel.analyzer.compress.impl.AbstractDecompressor
 
accept(CkanDataset) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
This consumer method maps a single CkanDataset object to a set of RDF triples.
acceptCharset - Variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
The value which will be used for the HTTP Accept-Charset header if the given CrawleableUri object does not define a header value.
ACCEPTED_SCHEMES - Static variable in class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
ACCEPTED_SCHEMES - Static variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
URI schemes which are accepted by this fetcher (i.e., "http" and "https").
acceptHeader - Variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
The value which will be used for the HTTP Accept header if the given CrawleableUri object does not define a header value.
activityUri - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
A unique id.
ActivityUtil - Class in org.dice_research.squirrel.metadata
A simple utilities class for working with the CrawlingActivity objects.
ActivityUtil() - Constructor for class org.dice_research.squirrel.metadata.ActivityUtil
 
addContextToJSON(String) - Static method in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
addContextToJSON(String) - Static method in class org.dice_research.squirrel.analyzer.impl.MicroformatMF2JAnalyzer
 
addData(CrawleableUri, InputStream) - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
addData(CrawleableUri, InputStream) - Method in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
 
addData(CrawleableUri, InputStream) - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
addData(CrawleableUri, InputStream) - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
 
addMetaData(Model) - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
 
addNewUri(CrawleableUri, CrawleableUri) - Method in class org.dice_research.squirrel.collect.SimpleUriCollector
 
addNewUri(CrawleableUri, CrawleableUri) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
addNewUri(CrawleableUri, CrawleableUri) - Method in interface org.dice_research.squirrel.collect.UriCollector
Adds the given new URI to the list of URIs collected for the given URI.
addNewUri(CrawleableUri, Node) - Method in interface org.dice_research.squirrel.collect.UriCollector
Adds the given new URI to the list of URIs collected for the given URI.
addNewUri(CrawleableUri, String) - Method in interface org.dice_research.squirrel.collect.UriCollector
Adds the given new URI to the list of URIs collected for the given URI.
addNewUri(CrawleableUri) - Method in class org.dice_research.squirrel.components.WorkerComponent
 
addNewUris(List<CrawleableUri>) - Method in class org.dice_research.squirrel.components.WorkerComponent
 
addNonLiteral(String, String, String) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
addOutputResource(String, Resource) - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
addPlainLiteral(String, String, String, String) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
addQuad(CrawleableUri, Quad) - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
addQuad(CrawleableUri, Quad) - Method in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
addQuad(AbstractBufferingSink, CrawleableUri, Quad) - Method in class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
addShutdownHook() - Static method in class org.dice_research.squirrel.components.WorkerComponentStarter
 
addStep(CrawleableUri, Class<?>, String...) - Static method in class org.dice_research.squirrel.metadata.ActivityUtil
A simple method which attaches a step with the given Class and the given actions to the CrawlingActivity of the given URI if it exists.
addStep(CrawleableUri, Class<?>) - Static method in class org.dice_research.squirrel.metadata.ActivityUtil
A simple method which attaches a step with the given Class to the CrawlingActivity of the given URI if it exists.
addStep(Class<?>, String...) - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
addTriple(String, String, String) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
Callback method for handling Clerezza triples.
addTriple(CrawleableUri, Triple) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
addTriple(CrawleableUri, Triple) - Method in interface org.dice_research.squirrel.collect.UriCollector
Adds the given triple to the list of URIs collected from the given URI.
addTriple(CrawleableUri, Triple) - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
addTriple(CrawleableUri, Triple) - Method in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
 
addTriple(CrawleableUri, Triple) - Method in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
addTriple(AbstractBufferingSink, CrawleableUri, Triple) - Method in class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 
addTypedLiteral(String, String, String, String) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
addUri(CrawleableUri, Node) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
addUri(String, byte[]) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
analyze(CrawleableUri, File, Sink) - Method in interface org.dice_research.squirrel.analyzer.Analyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanJsonAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.HDTAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HTMLScraperAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.JsonAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.MicrodataAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.MicroformatMF2JAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.RDFaAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.impl.RDFAnalyzer
 
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager
Iterates over all Analyzers that have been created and added to the analyzers map and checks whether each one is eligible to be executed by invoking isEligible().
analyze(CrawleableUri, File, Sink) - Method in class org.dice_research.squirrel.analyzer.manager.SimpleOrderedAnalyzerManager
Deprecated.
 
Analyzer - Interface in org.dice_research.squirrel.analyzer
 
analyzer - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
analyzers - Variable in class org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager
 
analyzers - Variable in class org.dice_research.squirrel.analyzer.manager.SimpleOrderedAnalyzerManager
Deprecated.
 
attempts - Variable in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
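The SimpleAnalyzerManager.analyze entry above describes iterating over the registered analyzers and running only those that report themselves eligible. A minimal sketch of that pattern follows. It does not use Squirrel's classes: `SketchAnalyzer`, `EligibilityDispatchSketch`, and the `String`-based signatures are hypothetical stand-ins, and the real parameters of `isEligible()` are not shown in this index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an analyzer; not Squirrel's Analyzer interface.
interface SketchAnalyzer {
    // Assumed eligibility check based on a MIME type string.
    boolean isEligible(String mimeType);

    // Assumed analysis step that returns the items it extracted.
    List<String> analyze(String content);
}

class EligibilityDispatchSketch {
    // Registered analyzers keyed by name; insertion order is preserved.
    private final Map<String, SketchAnalyzer> analyzers = new LinkedHashMap<>();

    void register(String name, SketchAnalyzer analyzer) {
        analyzers.put(name, analyzer);
    }

    // Runs every eligible analyzer and collects the results of all of them.
    List<String> analyze(String mimeType, String content) {
        List<String> results = new ArrayList<>();
        for (SketchAnalyzer analyzer : analyzers.values()) {
            if (analyzer.isEligible(mimeType)) {
                results.addAll(analyzer.analyze(content));
            }
        }
        return results;
    }
}
```

Analyzers that are not eligible are skipped silently in this sketch; whether the real manager logs or reports such cases is not stated here.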

B

bnodeMap - Variable in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
buffer - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
buffer - Variable in class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
buffer - Variable in class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 
bufferSize - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
bufferSize - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
bufferSize - Variable in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
bufferSize - Variable in class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
bufferSize - Variable in class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 
buildRDFStateMachine() - Static method in class org.dice_research.squirrel.analyzer.mime.RdfAutomata
Builds a finite state machine to validate a simple RDF file
buildTurtleStateMachine() - Static method in class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
Builds a finite state machine to validate a simple RDF file
BzipDecompressor - Class in org.dice_research.squirrel.analyzer.compress.impl
Decompression implementation for the BZip format
BzipDecompressor() - Constructor for class org.dice_research.squirrel.analyzer.compress.impl.BzipDecompressor
 
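The buffer and bufferSize entries above belong to classes that collect items and hand them on in batches. A minimal sketch of size-triggered buffering is given here under stated assumptions: `BufferingSketch` and its `Consumer`-based flush target are hypothetical and are not the API of AbstractBufferingSink, TripleBuffer, or QuadBuffer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a size-triggered buffer; not a Squirrel class.
class BufferingSketch<T> {
    private final int bufferSize;
    private final Consumer<List<T>> flushTarget;
    private final List<T> buffer = new ArrayList<>();

    BufferingSketch(int bufferSize, Consumer<List<T>> flushTarget) {
        if (bufferSize < 1) {
            throw new IllegalArgumentException("bufferSize must be at least 1");
        }
        this.bufferSize = bufferSize;
        this.flushTarget = flushTarget;
    }

    // Adds one item and flushes as soon as the buffer is full.
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= bufferSize) {
            flush();
        }
    }

    // Sends a copy of the buffered items to the target and empties the buffer.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushTarget.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

A caller would typically invoke `flush()` once more when closing, so that a partially filled buffer is not lost.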

C

canStop() - Method in interface org.dice_research.squirrel.analyzer.mime.FiniteStateMachine
Is the current state a final one?
canStop() - Method in class org.dice_research.squirrel.analyzer.mime.RdfAutomata
 
canStop() - Method in class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
 
checkForUriType - Variable in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
checkForUriType - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
CKAN_JSON_OBJECT_MIME_TYPE - Static variable in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
CKAN_WHITELIST_FILE - Static variable in class org.dice_research.squirrel.configurator.CkanWhiteListConfiguration
 
CKAN_WHITELIST_FILE - Static variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
Deprecated.
ckandata(String) - Static method in class org.dice_research.squirrel.worker.impl.WorkerImpl
Deprecated.
CkanDatasetConsumer - Class in org.dice_research.squirrel.analyzer.impl.ckan
A simple consumer of CkanDataset objects transforming them into RDF triples and writing the triples to the given Sink and UriCollector.
CkanDatasetConsumer(Sink, UriCollector, CrawleableUri, TripleEncoder) - Constructor for class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
CkanJsonAnalyzer - Class in org.dice_research.squirrel.analyzer.impl.ckan
This Analyzer implements the processing of JSON result objects representing CKAN datasets.
CkanJsonAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.ckan.CkanJsonAnalyzer
 
ckanwhitelist() - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
Deprecated.
CkanWhiteListConfiguration - Class in org.dice_research.squirrel.configurator
The CKAN URL list for CKAN Crawler
CkanWhiteListConfiguration(String) - Constructor for class org.dice_research.squirrel.configurator.CkanWhiteListConfiguration
 
ckanwhiteListURI - Variable in class org.dice_research.squirrel.configurator.CkanWhiteListConfiguration
 
client - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
client() - Method in class org.dice_research.squirrel.components.WorkerComponentConfig
 
client - Variable in class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 
client - Variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
The HTTP client instance used by this fetcher.
clientFrontier - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
clone() - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFile
 
close() - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
close() - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
close() - Method in class org.dice_research.squirrel.components.WorkerComponent
 
close() - Method in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
close() - Method in class org.dice_research.squirrel.fetcher.ckan.py.MicroServiceBasedCkanFetcher
 
close() - Method in class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 
close() - Method in class org.dice_research.squirrel.fetcher.dump.DumpFetcher
Deprecated.
 
close() - Method in class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
close() - Method in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
close() - Method in class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
close() - Method in class org.dice_research.squirrel.fetcher.sparql.JsonFetcher
 
close() - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 
close() - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
close() - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
close() - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
close() - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
closeComponent() - Static method in class org.dice_research.squirrel.components.WorkerComponentStarter
 
closeContext(ExtractionContext) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
closed - Static variable in class org.dice_research.squirrel.components.WorkerComponentStarter
 
closeSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SimpleUriCollector
 
closeSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
closeSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
closeSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
Recovers the generated temp file and parses it to HDT.
closeSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
closeSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
collector - Variable in class org.dice_research.squirrel.analyzer.AbstractAnalyzer
 
collector - Variable in class org.dice_research.squirrel.analyzer.commons.FilterSinkRDF
 
collector - Variable in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
collector - Variable in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
collector - Variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
collector - Variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanJsonAnalyzer
 
collector - Variable in class org.dice_research.squirrel.analyzer.impl.HDTAnalyzer
 
collector - Variable in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
collector - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HTMLScraperAnalyzer
 
collector - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
commitPendingChanges() - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
component - Static variable in class org.dice_research.squirrel.components.WorkerComponentStarter
 
connect(CrawleableUri, UriCollector, Sink) - Static method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
context - Static variable in class org.dice_research.squirrel.components.WorkerComponentStarter
 
convertNonLiteral(String) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
COUNT_URIS_QUERY - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
crawl(List<CrawleableUri>) - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
CrawlingActivity - Class in org.dice_research.squirrel.metadata
Representation of Crawling activity.
CrawlingActivity(CrawleableUri, String) - Constructor for class org.dice_research.squirrel.metadata.CrawlingActivity
Constructor.
CrawlingActivity.CrawlingURIState - Enum in org.dice_research.squirrel.metadata
 
crawlingDone(List<CrawleableUri>) - Method in class org.dice_research.squirrel.components.WorkerComponent
 
CrawlingURIState() - Constructor for enum org.dice_research.squirrel.metadata.CrawlingActivity.CrawlingURIState
 
create(String) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
create(String, Connection, int) - Static method in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
create(String) - Static method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
create(String, String, String, int, int) - Static method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
CREATE_TABLE_QUERY - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
createDatasetResource(CkanDataset, Map<String, String>) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
createModelFromJSONLD(String) - Static method in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
Creates a Model from JSON-LD
createModelFromJSONLD(String) - Static method in class org.dice_research.squirrel.analyzer.impl.MicroformatMF2JAnalyzer
Creates a Model from JSON-LD
createOutputFile() - Method in class org.dice_research.squirrel.analyzer.compress.impl.AbstractDecompressor
 
createStream(String, boolean) - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
curi - Variable in class org.dice_research.squirrel.analyzer.commons.FilterSinkRDF
 
curi - Variable in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
curi - Variable in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
curi - Variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
curiString - Variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
current - Variable in class org.dice_research.squirrel.analyzer.mime.RdfAutomata
Current state.
current - Variable in class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
Current state.
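
The canStop() and current entries above describe a finite state machine that tracks a current state and reports whether that state is a final one. The following is a minimal sketch of the idea, not the implementation of FiniteStateMachine, RdfAutomata, or TurtleAutomata: the class `KeywordStateMachineSketch`, the `forKeyword` factory, and the `step` method name are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not one of Squirrel's automata classes.
class KeywordStateMachineSketch {
    private static final int ERROR = -1;

    // transitions.get(state).get(symbol) yields the next state.
    private final Map<Integer, Map<Character, Integer>> transitions = new HashMap<>();
    private final Set<Integer> finalStates = new HashSet<>();

    // Current state; state 0 is the start state.
    private int current = 0;

    // Builds a machine that accepts exactly the given keyword.
    static KeywordStateMachineSketch forKeyword(String keyword) {
        KeywordStateMachineSketch fsm = new KeywordStateMachineSketch();
        for (int i = 0; i < keyword.length(); i++) {
            fsm.transitions.computeIfAbsent(i, k -> new HashMap<>())
                    .put(keyword.charAt(i), i + 1);
        }
        fsm.finalStates.add(keyword.length());
        return fsm;
    }

    // Consumes one symbol; a symbol without a transition leads to the error state.
    void step(char symbol) {
        Map<Character, Integer> outgoing = transitions.get(current);
        Integer next = (outgoing == null) ? null : outgoing.get(symbol);
        current = (next == null) ? ERROR : next;
    }

    // Is the current state a final one?
    boolean canStop() {
        return finalStates.contains(current);
    }
}
```

Once the error state is reached, no further input can bring this sketch back to a final state, so a caller may stop reading early.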

D

dataDirectory - Variable in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
dataDirectory - Variable in class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
dataDirectory - Variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
The temporary directory which will be used to store downloaded data.
dataDirectory - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 
dataDirectory - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
dataOutputStream - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
dataSetQuery - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
The delay that the system will have between sending two queries.
dateEnded - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
When the activity has ended.
dateStarted - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
When the activity has started.
dbConnection - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
decompress(CrawleableUri, File) - Method in interface org.dice_research.squirrel.analyzer.compress.Decompressor
 
decompress(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.compress.impl.BzipDecompressor
 
decompress(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.compress.impl.GzDecompressor
 
decompress(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.compress.impl.SevenZipDecompressor
 
decompress(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.compress.impl.TarDecompressor
 
decompress(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.compress.impl.ZipDecompressor
 
decompressFile(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.compress.impl.FileManager
 
Decompressor - Interface in org.dice_research.squirrel.analyzer.compress
Interface for a Decompressor class
deduplicationActive - Variable in class org.dice_research.squirrel.components.WorkerComponent
Indicates whether deduplication is active.
DEFAULT_ACCEPT_HEADER_STRING - Static variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
The default HTTP Accept header value which simply accepts everything.
DEFAULT_BUFFER_SIZE - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
DEFAULT_BUFFER_SIZE - Static variable in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
Interval that specifies how many triples are to be buffered at once until they are sent to the sink.
DEFAULT_MIN_WAITING_TIME - Static variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
DEFAULT_OUTPUT_LANG - Static variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
DEFAULT_TIMEOUT - Variable in class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
DEFAULT_WAITING_TIME - Static variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
Deprecated.
defaultMinWaitingTime - Variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
delay - Variable in class org.dice_research.squirrel.fetcher.delay.StaticDelayer
The delay (in ms) that should be introduced between two requests.
delay - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
delay - Variable in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
Delayer - Interface in org.dice_research.squirrel.fetcher.delay
A delayer can be used to make sure that a server is not flooded with requests, i.e., a fetcher can introduce delays between single requests without the delay functionality being programmed into every fetcher instance.
deleteTriples() - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
DereferencingFetcher - Class in org.dice_research.squirrel.fetcher.deref
Deprecated.
Use the HTTPFetcher instead.
DereferencingFetcher() - Constructor for class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 
detectFileType(CrawleableUri, String) - Method in class org.dice_research.squirrel.fetcher.dump.DumpFetcher
Deprecated.
 
detectMimeType(File) - Method in class org.dice_research.squirrel.analyzer.compress.impl.FileManager
 
detectMimeType(File) - Method in class org.dice_research.squirrel.analyzer.mime.MimeTypeDetector
 
detectMimeType(File) - Method in interface org.dice_research.squirrel.analyzer.mime.TypeDetector
 
doc - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
doesRecrawling() - Method in class org.dice_research.squirrel.components.WorkerComponent
 
domainLogFile - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
downloadFile(CrawleableUri, String) - Method in class org.dice_research.squirrel.fetcher.dump.DumpFetcher
Deprecated.
 
DROP_TABLE_QUERY - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
DummyDelayer - Class in org.dice_research.squirrel.fetcher.delay
A dummy instance of the Delayer interface
DummyDelayer() - Constructor for class org.dice_research.squirrel.fetcher.delay.DummyDelayer
 
DumpFetcher - Class in org.dice_research.squirrel.fetcher.dump
Deprecated.
Use the HTTPFetcher instead.
DumpFetcher() - Constructor for class org.dice_research.squirrel.fetcher.dump.DumpFetcher
Deprecated.
 
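The Delayer, delay, and DummyDelayer entries above describe a component that spaces out consecutive requests by a fixed number of milliseconds. A minimal sketch of a fixed delay is given here under stated assumptions: only `getDelay()` appears in this index, so `StaticDelaySketch`, `recordRequest`, `remainingWait`, and `awaitTurn` are hypothetical names and not Squirrel's API.

```java
// Hypothetical sketch; not Squirrel's Delayer, StaticDelayer, or DummyDelayer.
class StaticDelaySketch {
    private final long delayMs;
    private boolean hasPreviousRequest = false;
    private long lastRequestMs = 0;

    StaticDelaySketch(long delayMs) {
        if (delayMs < 0) {
            throw new IllegalArgumentException("delay must not be negative");
        }
        this.delayMs = delayMs;
    }

    // Returns the delay (in ms) introduced between two consecutive requests.
    long getDelay() {
        return delayMs;
    }

    // Remembers the time (in ms) at which a request was sent.
    void recordRequest(long nowMs) {
        hasPreviousRequest = true;
        lastRequestMs = nowMs;
    }

    // Returns how long (in ms) a caller still has to wait at time nowMs.
    long remainingWait(long nowMs) {
        if (!hasPreviousRequest) {
            return 0;
        }
        return Math.max(0, delayMs - (nowMs - lastRequestMs));
    }

    // Sleeps until the delay has passed, then records the new request.
    void awaitTurn() throws InterruptedException {
        long wait = remainingWait(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        recordRequest(System.currentTimeMillis());
    }
}
```

Keeping the waiting logic in one object means that a fetcher only has to call `awaitTurn()` before each request. A delay of 0 gives the behavior one would expect from a dummy delayer.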

E

ElementNotFoundException - Exception in org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions
 
ElementNotFoundException(String) - Constructor for exception org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions.ElementNotFoundException
 
ENABLE_CKAN_CRAWLER_FORWARDING - Static variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
Deprecated.
encoder - Variable in class org.dice_research.squirrel.analyzer.commons.FilterSinkRDF
 
encodeTriple(Triple) - Method in class org.dice_research.squirrel.encoder.TripleEncoder
Encodes the given triple based on Jena escaping rules.
encodeUri(Node) - Method in class org.dice_research.squirrel.encoder.TripleEncoder
 
endDocument(IRI) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
endStream() - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
ERROR_EXIT_CODE - Static variable in class org.dice_research.squirrel.components.WorkerComponentStarter
Exit code that is used if the program has to terminate because of an internal error.
execute_unsecured() - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
executeAsBatch_unsecured() - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
Deprecated.
EXECUTION_SERVICE - Variable in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
 

F

failedParseAttempts - Variable in class org.dice_research.squirrel.analyzer.impl.RDFAnalyzer
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.ckan.py.MicroServiceBasedCkanFetcher
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.dump.DumpFetcher
Deprecated.
 
fetch(CrawleableUri) - Method in interface org.dice_research.squirrel.fetcher.Fetcher
Fetches a stream of data from the given URI, stores it and returns a File object pointing to the stored data.
fetch(CrawleableUri, Delayer) - Method in interface org.dice_research.squirrel.fetcher.Fetcher
Fetches a stream of data from the given URI by following the delay implemented by the given Delayer, stores it and returns a File object pointing to the stored data.
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.sparql.JsonFetcher
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 
fetch(CrawleableUri, Delayer) - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
fetchDataset(CkanClient, int, int, OutputStream) - Method in class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
fetchDataset(CkanClient, String, OutputStream) - Method in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
Fetcher - Interface in org.dice_research.squirrel.fetcher
Interface of a class that fetches the data of a given CrawleableUri instance.
FETCHER - Static variable in class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
fetcher - Variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
fetcher - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
fetchers - Variable in class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
file_descriptor - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFile
 
FileBasedSink - Class in org.dice_research.squirrel.sink.impl.file
 
FileBasedSink(File, boolean) - Constructor for class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
FileBasedSink(File, Lang, boolean) - Constructor for class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
FileBasedSink.StreamStatus - Class in org.dice_research.squirrel.sink.impl.file
 
fileExtension - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 
FileManager - Class in org.dice_research.squirrel.analyzer.compress.impl
Class responsible for detecting the MIME type of the fetched file and decompressing the file if necessary.
FileManager() - Constructor for class org.dice_research.squirrel.analyzer.compress.impl.FileManager
 
FilterSinkRDF - Class in org.dice_research.squirrel.analyzer.commons
RDF Filter to parse RDF Streams
FilterSinkRDF(CrawleableUri, Sink, UriCollector, TripleEncoder) - Constructor for class org.dice_research.squirrel.analyzer.commons.FilterSinkRDF
 
filterYamlFiles(List<File>) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 
findLicense(CkanResource) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
findTheme(List<CkanPair>) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
finishActivity(Sink) - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
Finishes the crawling activity and sends the data to the sink.
FiniteStateMachine - Interface in org.dice_research.squirrel.analyzer.mime
Finite state machine.
flush() - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
flushMetadata() - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
flushMetadata() - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
flushMetadata() - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
 
formatNodeToString(Node) - Static method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
Formats the node for a query
frontier - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
FTPFetcher - Class in org.dice_research.squirrel.fetcher.ftp
A simple fetcher using the FTP protocol.
FTPFetcher() - Constructor for class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
FTPRecursiveFetcher - Class in org.dice_research.squirrel.fetcher.ftp
 
FTPRecursiveFetcher(Path) - Constructor for class org.dice_research.squirrel.fetcher.ftp.FTPRecursiveFetcher
 
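The Fetcher.fetch entries above describe fetching a stream of data, storing it, and returning a File object that points to the stored data. The storing step can be sketched as follows, assuming a data directory like the dataDirectory variables listed in this index. `StoreStreamSketch` and `store` are hypothetical names, and the sketch leaves out the network access and the delay handling.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of the "store and return a File" step; not a Squirrel class.
class StoreStreamSketch {
    // Copies the stream into a new temporary file inside dataDirectory.
    static File store(InputStream data, Path dataDirectory) throws IOException {
        Files.createDirectories(dataDirectory);
        Path target = Files.createTempFile(dataDirectory, "fetched_", ".tmp");
        // createTempFile already created the file, so the copy has to replace it.
        Files.copy(data, target, StandardCopyOption.REPLACE_EXISTING);
        return target.toFile();
    }
}
```

The caller remains responsible for closing the input stream and for deleting the file once it has been analyzed.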

G

generateFileName(CrawleableUri, Lang, boolean) - Static method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
generateTableName(String) - Static method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
Generates a table name based on the given URI.
get() - Static method in class org.dice_research.squirrel.fetcher.delay.DummyDelayer
 
getAddQuery(Collection<Triple>) - Method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
Returns an add query for the default URI and its triples.
getAddQuery(String, Collection<Triple>) - Method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
Returns an add query for the given URI and its triples.
getAddQuery(String, Collection<Triple>, boolean) - Method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
Returns an add query for the given URI or the default graph and its triples.
getBNode(String) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
getCkanWhiteListConfiguration() - Static method in class org.dice_research.squirrel.configurator.CkanWhiteListConfiguration
 
getCkanWhiteListURI() - Method in class org.dice_research.squirrel.configurator.CkanWhiteListConfiguration
 
getCrawleableUri() - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
getDataDirectory() - Method in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
getDataOutputStream() - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
getDelay() - Method in interface org.dice_research.squirrel.fetcher.delay.Delayer
Returns the delay (in ms) that this instance typically would introduce between two consecutive requests.
getDelay() - Method in class org.dice_research.squirrel.fetcher.delay.DummyDelayer
 
getDelay() - Method in class org.dice_research.squirrel.fetcher.delay.StaticDelayer
 
getEnvCkanWhiteListURI() - Static method in class org.dice_research.squirrel.configurator.CkanWhiteListConfiguration
 
getEnvMinDelay() - Static method in class org.dice_research.squirrel.configurator.RobotsManagerConfiguration
 
getFetchers() - Method in class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
getFile_descriptor() - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFile
 
getGraphId(CrawleableUri) - Static method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
Gets the ID of the graph in which the given URI is stored.
getGraphId(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
Gets the ID of the graph in which the given URI is stored.
getHtmlScraperConfiguration() - Static method in class org.dice_research.squirrel.configurator.HtmlScraperConfiguration
 
getId() - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
Deprecated.
getInstance() - Static method in class org.dice_research.squirrel.encoder.TripleEncoder
 
getInstance() - Static method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
getMimeType() - Method in interface org.dice_research.squirrel.analyzer.mime.FiniteStateMachine
 
getMimeType() - Method in class org.dice_research.squirrel.analyzer.mime.RdfAutomata
 
getMimeType() - Method in class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
 
getMinDelay() - Method in class org.dice_research.squirrel.configurator.RobotsManagerConfiguration
 
getMinWaitingTime(CrawleableUri) - Method in interface org.dice_research.squirrel.robots.RobotsManager
Returns the minimum time a crawler should wait before sending a new request to the given domain.
getMinWaitingTime(CrawleableUri) - Method in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
getNextUris() - Method in class org.dice_research.squirrel.components.WorkerComponent
 
getNumberOfPendingUris() - Method in class org.dice_research.squirrel.components.WorkerComponent
 
getNumberOfQuads() - Method in class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
getNumberOfTriples() - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
getNumberOfTriples() - Method in class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 
getOutputFolder() - Method in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
getPath() - Method in class org.dice_research.squirrel.configurator.HtmlScraperConfiguration
 
getRequestPermission() - Method in interface org.dice_research.squirrel.fetcher.delay.Delayer
Waits for the permission to perform a new request.
getRequestPermission() - Method in class org.dice_research.squirrel.fetcher.delay.DummyDelayer
 
getRequestPermission() - Method in class org.dice_research.squirrel.fetcher.delay.StaticDelayer
 
getRobotsManagerConfiguration() - Static method in class org.dice_research.squirrel.configurator.RobotsManagerConfiguration
 
getRules(CrawleableUri) - Method in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
getSelectQuery() - Method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
Returns a select query for the default graph.
getSelectQuery(String, boolean) - Method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
Returns a select query for the given graph ID or the default graph.
getSelectQuery(String) - Method in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
Returns a select query for the given graph ID.
getSize() - Method in class org.dice_research.squirrel.collect.SimpleUriCollector
Returns the total number of new URIs that have been added to this collector.
getSize(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SimpleUriCollector
 
getSize() - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
getSize(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
getSize(CrawleableUri) - Method in interface org.dice_research.squirrel.collect.UriCollector
Returns the total number of URIs that have been collected.
getSparqlHost() - Method in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
getSqarqlPort() - Method in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
getState() - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
getStepsAsString() - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
getStream(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
getTableName(CrawleableUri) - Static method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
Retrieves the URI's table name from its properties or generates a new table name and adds it to the URI (using the "URI_COLLECTOR_TABLE_NAME" property).
getTableName() - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
getTripleOutputStream() - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
getTriplesForGraph(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
getTriplesForGraph(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
 
getUpdateDatasetURI() - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
 
getUri() - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
getUri() - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
getUris(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SimpleUriCollector
 
getUris(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
getUris(CrawleableUri) - Method in interface org.dice_research.squirrel.collect.UriCollector
Returns a list of serialized CrawleableUri instances that have been collected for the given URI.
getWorkerConfiguration() - Static method in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
getYamlFiles() - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 
getYamlFilesPath() - Static method in class org.dice_research.squirrel.configurator.HtmlScraperConfiguration
 
graphQuery - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
GzDecompressor - Class in org.dice_research.squirrel.analyzer.compress.impl
Decompression implementation for the GZ format
GzDecompressor() - Constructor for class org.dice_research.squirrel.analyzer.compress.impl.GzDecompressor
 

H

hasNext() - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher.SelectedTriplesIterator
 
hasNext() - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher.SelectedTriplesIterator
 
HDTAnalyzer - Class in org.dice_research.squirrel.analyzer.impl
Analyzer to parse HDTAnalyzer types
HDTAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.HDTAnalyzer
 
HdtBasedSink - Class in org.dice_research.squirrel.sink.impl.hdt
HDT file-based sink. Uses the FileBasedSink to store the triples of the URI in a temp folder and converts them to HDT when the sink is closed.
HdtBasedSink(File) - Constructor for class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
Creates a temp file for the FileBasedSink storage.
HdtBasedSink.HDTParser - Class in org.dice_research.squirrel.sink.impl.hdt
 
HDTParser(String, CrawleableUri) - Constructor for class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink.HDTParser
 
HTML - Static variable in class org.dice_research.squirrel.analyzer.manager.SimpleOrderedAnalyzerManager
Deprecated.
 
HTML_SCRAPER_YAML_PATH - Static variable in class org.dice_research.squirrel.configurator.HtmlScraperConfiguration
 
HtmlScraper - Class in org.dice_research.squirrel.analyzer.impl.html.scraper
HTML scraper that extracts triples from HTML data based on preconfigured YAML files.
HtmlScraper(File) - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
HtmlScraper() - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
htmlScraper - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HTMLScraperAnalyzer
 
HTMLScraperAnalyzer - Class in org.dice_research.squirrel.analyzer.impl.html.scraper
Analyzer that scrapes HTML files based on a preset setup from YAML files.
HTMLScraperAnalyzer(UriCollector, HtmlScraper) - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.HTMLScraperAnalyzer
 
HTMLScraperAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.HTMLScraperAnalyzer
 
HtmlScraperConfiguration - Class in org.dice_research.squirrel.configurator
 
HtmlScraperConfiguration(String) - Constructor for class org.dice_research.squirrel.configurator.HtmlScraperConfiguration
 
HTTP_RESPONSE_HEADER_PREFIX - Static variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
The prefix which is added to HTTP response headers before they are stored in the CrawleableUri's data map.
HTTPFetcher - Class in org.dice_research.squirrel.fetcher.http
Fetcher which uses an HTTP client to fetch data and store it in a temporary directory.
HTTPFetcher() - Constructor for class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
HTTPFetcher(String) - Constructor for class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
HTTPFetcher(CloseableHttpClient) - Constructor for class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 

I

id - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
Deprecated.
increaseTripleCount() - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
init() - Method in class org.dice_research.squirrel.components.WorkerComponent
 
initQueryExecution(String, Delayer) - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 
initQueryExecution(String) - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
inputType - Variable in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
input type for parsing the file
INSERT_URI_QUERY_PART_1 - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
INSERT_URI_QUERY_PART_2 - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
insertStmt - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
instance - Static variable in class org.dice_research.squirrel.fetcher.delay.DummyDelayer
 
instance - Static variable in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
The instance of the class QueryGenerator.
is - Variable in class org.dice_research.squirrel.fetcher.sparql.JsonFetcher
 
isCheckForUriType() - Method in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
isElegible(CrawleableUri, File) - Method in interface org.dice_research.squirrel.analyzer.Analyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanJsonAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.HDTAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HTMLScraperAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.JsonAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.MicrodataAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.MicroformatMF2JAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.RDFaAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.impl.RDFAnalyzer
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager
 
isElegible(CrawleableUri, File) - Method in class org.dice_research.squirrel.analyzer.manager.SimpleOrderedAnalyzerManager
Deprecated.
 
isError() - Method in interface org.dice_research.squirrel.analyzer.mime.FiniteStateMachine
 
isError() - Method in class org.dice_research.squirrel.analyzer.mime.RdfAutomata
 
isError - Variable in class org.dice_research.squirrel.analyzer.mime.RtState
 
isError() - Method in class org.dice_research.squirrel.analyzer.mime.RtState
 
isError() - Method in interface org.dice_research.squirrel.analyzer.mime.State
 
isError() - Method in class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
 
isFinal - Variable in class org.dice_research.squirrel.analyzer.mime.RtState
 
isFinal() - Method in class org.dice_research.squirrel.analyzer.mime.RtState
 
isFinal() - Method in interface org.dice_research.squirrel.analyzer.mime.State
Can the automaton stop on this state?
isPossible(String) - Method in class org.dice_research.squirrel.analyzer.mime.RtTransition
 
isPossible(String) - Method in interface org.dice_research.squirrel.analyzer.mime.Transition
Is the transition possible with the given character?
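The State and Transition entries above (isFinal, isError, isPossible) describe a character-driven automaton. The following is a minimal, self-contained sketch of that idea, not the project's RdfAutomata or RtState code. All class names (SketchState, SketchTransition, SketchAutomaton) and the example language (lowercase letters inside angle brackets) are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical state: may be final (accepting) or an error sink. */
final class SketchState {
    /** Shared error state; it has no transitions, so it can never be left. */
    static final SketchState ERROR = new SketchState(false, true);

    final boolean isFinal;
    final boolean isError;
    final List<SketchTransition> transitions = new ArrayList<>();

    SketchState(boolean isFinal, boolean isError) {
        this.isFinal = isFinal;
        this.isError = isError;
    }

    /** Follows the first transition that is possible for the given character. */
    SketchState next(String character) {
        for (SketchTransition t : transitions) {
            if (t.isPossible(character)) {
                return t.next;
            }
        }
        return ERROR;
    }
}

/** Hypothetical transition: possible if the character is in the allowed set. */
final class SketchTransition {
    final String allowedChars;
    final SketchState next;

    SketchTransition(String allowedChars, SketchState next) {
        this.allowedChars = allowedChars;
        this.next = next;
    }

    boolean isPossible(String character) {
        return character.length() == 1 && allowedChars.contains(character);
    }
}

/** Hypothetical automaton that feeds an input one character at a time. */
final class SketchAutomaton {
    private final SketchState start;

    SketchAutomaton(SketchState start) {
        this.start = start;
    }

    /** Returns true if the input ends in a final, non-error state. */
    boolean accepts(String input) {
        SketchState current = start;
        for (int i = 0; i < input.length() && !current.isError; i++) {
            current = current.next(String.valueOf(input.charAt(i)));
        }
        return current.isFinal && !current.isError;
    }

    /** Example automaton: '<', one or more lowercase letters, then '>'. */
    static SketchAutomaton angleBracketed() {
        SketchState s0 = new SketchState(false, false);
        SketchState s1 = new SketchState(false, false);
        SketchState s2 = new SketchState(false, false);
        SketchState s3 = new SketchState(true, false);
        String letters = "abcdefghijklmnopqrstuvwxyz";
        s0.transitions.add(new SketchTransition("<", s1));
        s1.transitions.add(new SketchTransition(letters, s2));
        s2.transitions.add(new SketchTransition(letters, s2));
        s2.transitions.add(new SketchTransition(">", s3));
        return new SketchAutomaton(s0);
    }
}
```

In this sketch, only the last state answers "yes" to the question of whether the automaton can stop there, and any character without a possible transition leads to the shared error state.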
isUriCrawlable(CrawleableUri) - Method in interface org.dice_research.squirrel.robots.RobotsManager
Returns true, if the robots.txt file does not forbid the crawling of that URI.
isUriCrawlable(CrawleableUri) - Method in class org.dice_research.squirrel.robots.RobotsManagerImpl
 

J

jenaContentTypes - Variable in class org.dice_research.squirrel.analyzer.impl.RDFAnalyzer
 
JsonAnalyzer - Class in org.dice_research.squirrel.analyzer.impl
 
JsonAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.JsonAnalyzer
 
JsonFetcher - Class in org.dice_research.squirrel.fetcher.sparql
 
JsonFetcher() - Constructor for class org.dice_research.squirrel.fetcher.sparql.JsonFetcher
 
jsoupQuery(String) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 

K

knownUris - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 

L

label - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
lastIpAddress - Variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
lastRequestTimeStamp - Variable in class org.dice_research.squirrel.fetcher.delay.StaticDelayer
The time stamp (in ms) at which the last request has been finished.
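The Delayer entries in this index (getDelay, getRequestPermission) and the lastRequestTimeStamp field suggest a fixed pause between consecutive requests. The following sketch illustrates one way such a delay could be computed. It is not the project's StaticDelayer: the class name FixedDelaySketch, the injectable clock, and the methods remainingWaitMs and requestFinished are assumptions made for illustration only.

```java
import java.util.function.LongSupplier;

/** Hypothetical fixed-delay sketch; not the project's StaticDelayer. */
final class FixedDelaySketch {
    private final long delayMs;
    // Injectable clock (an assumption) so the timing logic can be checked without sleeping.
    private final LongSupplier clock;
    // Time stamp (in ms) at which the last request finished.
    private long lastRequestTimeStamp;
    private boolean anyRequestFinished = false;

    FixedDelaySketch(long delayMs, LongSupplier clock) {
        this.delayMs = delayMs;
        this.clock = clock;
    }

    /** Returns the delay (in ms) enforced between two consecutive requests. */
    long getDelay() {
        return delayMs;
    }

    /** Returns how many ms a caller would still have to wait right now. */
    synchronized long remainingWaitMs() {
        if (!anyRequestFinished) {
            return 0; // nothing to wait for before the very first request
        }
        long elapsed = clock.getAsLong() - lastRequestTimeStamp;
        return Math.max(0, delayMs - elapsed);
    }

    /** Blocks until the delay since the last finished request has passed. */
    synchronized void getRequestPermission() throws InterruptedException {
        long wait = remainingWaitMs();
        if (wait > 0) {
            Thread.sleep(wait);
        }
    }

    /** Records the end of a request so the next one is delayed accordingly. */
    synchronized void requestFinished() {
        lastRequestTimeStamp = clock.getAsLong();
        anyRequestFinished = true;
    }
}
```

A caller would typically pass System::currentTimeMillis as the clock, call getRequestPermission() before each request, and call requestFinished() afterwards.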
limit - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
LIST_ANALYZERS - Static variable in class org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager
 
listDirectory(FTPClient, String, String, int) - Method in class org.dice_research.squirrel.fetcher.ftp.FTPRecursiveFetcher
 
listFilesIn(String) - Static method in class org.dice_research.squirrel.fetcher.utils.ZipArchiver
 
listIterableObjects - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
loadFiles(File) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.compress.impl.FileManager
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanJsonAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.HDTAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HTMLScraperAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.JsonAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.MicrodataAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.MicroformatMF2JAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.RDFaAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.impl.RDFAnalyzer
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.manager.SimpleOrderedAnalyzerManager
Deprecated.
 
LOGGER - Static variable in class org.dice_research.squirrel.analyzer.mime.MimeTypeDetector
 
LOGGER - Static variable in class org.dice_research.squirrel.collect.SimpleUriCollector
 
LOGGER - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
LOGGER - Static variable in class org.dice_research.squirrel.components.WorkerComponent
 
LOGGER - Static variable in class org.dice_research.squirrel.components.WorkerComponentStarter
 
LOGGER - Static variable in class org.dice_research.squirrel.configurator.CkanWhiteListConfiguration
 
LOGGER - Static variable in class org.dice_research.squirrel.configurator.HtmlScraperConfiguration
 
LOGGER - Static variable in class org.dice_research.squirrel.configurator.RobotsManagerConfiguration
 
LOGGER - Static variable in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
LOGGER - Static variable in class org.dice_research.squirrel.encoder.TripleEncoder
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.dump.DumpFetcher
Deprecated.
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.sparql.JsonFetcher
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 
LOGGER - Static variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
LOGGER - Static variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
LOGGER - Static variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
LOGGER - Static variable in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
 
LOGGER - Static variable in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
LOGGER - Static variable in class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
 
LOGGER - Static variable in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
LOGGER - Static variable in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
 
LOGGER - Static variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 

M

main(String[]) - Static method in class org.dice_research.squirrel.analyzer.impl.RDFaAnalyzer
 
main(String[]) - Static method in class org.dice_research.squirrel.components.WorkerComponentStarter
This is the main method creating and starting an instance of a Component with the given class name.
main(String[]) - Static method in class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
manager - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
mapper - Variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanJsonAnalyzer
 
mapper - Variable in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
matchesSerialization(String, String) - Method in class org.dice_research.squirrel.fetcher.dump.DumpFetcher
Deprecated.
 
MAX_ALPHANUM_PART_OF_TABLE_NAME - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
MAX_URIS_PER_MESSAGE - Static variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
metadataGraphUri - Variable in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
metaDataGraphUri - Variable in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
URI for the metadata graph; it will be stored in the default graph.
MicrodataAnalyzer - Class in org.dice_research.squirrel.analyzer.impl
Any23 Microdata Extractor
MicrodataAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.MicrodataAnalyzer
 
MicroformatMF2JAnalyzer - Class in org.dice_research.squirrel.analyzer.impl
MF2JParser for Microformats
MicroformatMF2JAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.MicroformatMF2JAnalyzer
 
MicroformatsAnalyzer - Class in org.dice_research.squirrel.analyzer.impl.html.mf
 
MicroformatsAnalyzer(UriCollector, Mf2Parser) - Constructor for class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
MicroformatsAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
MicroServiceBasedCkanFetcher - Class in org.dice_research.squirrel.fetcher.ckan.py
A micro service client which communicates via RabbitMQ and assumes that the micro service is able to fetch the content of a CKAN.
MicroServiceBasedCkanFetcher() - Constructor for class org.dice_research.squirrel.fetcher.ckan.py.MicroServiceBasedCkanFetcher
 
mime_type - Variable in enum org.dice_research.squirrel.analyzer.compress.enums.MimeTypeEnum
 
mime_type() - Method in enum org.dice_research.squirrel.analyzer.compress.enums.MimeTypeEnum
 
mimeType - Variable in class org.dice_research.squirrel.analyzer.mime.RdfAutomata
 
mimeType - Variable in class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
 
MimeTypeDetector - Class in org.dice_research.squirrel.analyzer.mime
 
MimeTypeDetector() - Constructor for class org.dice_research.squirrel.analyzer.mime.MimeTypeDetector
 
MimeTypeEnum - Enum in org.dice_research.squirrel.analyzer.compress.enums
 
MimeTypeEnum(String) - Constructor for enum org.dice_research.squirrel.analyzer.compress.enums.MimeTypeEnum
 
MIN_DELAY_KEY - Static variable in class org.dice_research.squirrel.configurator.RobotsManagerConfiguration
 
minDelay - Variable in class org.dice_research.squirrel.configurator.RobotsManagerConfiguration
 
MINIMUM_DELAY - Static variable in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
The default minimum delay that the system will have between sending two queries.
minimumDelay - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 

N

NEWLINE_CHAR - Static variable in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
next - Variable in class org.dice_research.squirrel.analyzer.mime.RtTransition
 
next() - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher.SelectedTriplesIterator
 
next() - Method in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher.SelectedTriplesIterator
 
numberOfQuads - Variable in class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
numberOfTriples - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
 
numberOfTriples - Variable in class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 

O

openContext(ExtractionContext) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
openSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SimpleUriCollector
 
openSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
openSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
 
openSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
 
openSinkForUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
org.dice_research.squirrel.analyzer - package org.dice_research.squirrel.analyzer
 
org.dice_research.squirrel.analyzer.commons - package org.dice_research.squirrel.analyzer.commons
 
org.dice_research.squirrel.analyzer.compress - package org.dice_research.squirrel.analyzer.compress
 
org.dice_research.squirrel.analyzer.compress.enums - package org.dice_research.squirrel.analyzer.compress.enums
 
org.dice_research.squirrel.analyzer.compress.impl - package org.dice_research.squirrel.analyzer.compress.impl
 
org.dice_research.squirrel.analyzer.impl - package org.dice_research.squirrel.analyzer.impl
 
org.dice_research.squirrel.analyzer.impl.ckan - package org.dice_research.squirrel.analyzer.impl.ckan
 
org.dice_research.squirrel.analyzer.impl.html.mf - package org.dice_research.squirrel.analyzer.impl.html.mf
 
org.dice_research.squirrel.analyzer.impl.html.scraper - package org.dice_research.squirrel.analyzer.impl.html.scraper
 
org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions - package org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions
 
org.dice_research.squirrel.analyzer.manager - package org.dice_research.squirrel.analyzer.manager
 
org.dice_research.squirrel.analyzer.mime - package org.dice_research.squirrel.analyzer.mime
 
org.dice_research.squirrel.collect - package org.dice_research.squirrel.collect
 
org.dice_research.squirrel.components - package org.dice_research.squirrel.components
 
org.dice_research.squirrel.configurator - package org.dice_research.squirrel.configurator
 
org.dice_research.squirrel.encoder - package org.dice_research.squirrel.encoder
 
org.dice_research.squirrel.fetcher - package org.dice_research.squirrel.fetcher
 
org.dice_research.squirrel.fetcher.ckan.java - package org.dice_research.squirrel.fetcher.ckan.java
 
org.dice_research.squirrel.fetcher.ckan.py - package org.dice_research.squirrel.fetcher.ckan.py
 
org.dice_research.squirrel.fetcher.delay - package org.dice_research.squirrel.fetcher.delay
 
org.dice_research.squirrel.fetcher.deref - package org.dice_research.squirrel.fetcher.deref
 
org.dice_research.squirrel.fetcher.dump - package org.dice_research.squirrel.fetcher.dump
 
org.dice_research.squirrel.fetcher.ftp - package org.dice_research.squirrel.fetcher.ftp
 
org.dice_research.squirrel.fetcher.http - package org.dice_research.squirrel.fetcher.http
 
org.dice_research.squirrel.fetcher.manage - package org.dice_research.squirrel.fetcher.manage
 
org.dice_research.squirrel.fetcher.sparql - package org.dice_research.squirrel.fetcher.sparql
 
org.dice_research.squirrel.fetcher.utils - package org.dice_research.squirrel.fetcher.utils
 
org.dice_research.squirrel.metadata - package org.dice_research.squirrel.metadata
 
org.dice_research.squirrel.robots - package org.dice_research.squirrel.robots
 
org.dice_research.squirrel.sink.impl.file - package org.dice_research.squirrel.sink.impl.file
 
org.dice_research.squirrel.sink.impl.hdt - package org.dice_research.squirrel.sink.impl.hdt
 
org.dice_research.squirrel.sink.impl.sparql - package org.dice_research.squirrel.sink.impl.sparql
 
org.dice_research.squirrel.worker.impl - package org.dice_research.squirrel.worker.impl
 
out - Variable in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
Output stream for metadata.
OUTPUT_FOLDER_KEY - Static variable in class org.dice_research.squirrel.components.WorkerComponent
 
OUTPUT_FOLDER_KEY - Static variable in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
OUTPUT_GRAPH_PROPERTY - Static variable in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
outputDirectory - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
Directory to which the files of this sink are written.
outputDirectory - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
outputDirectory - Variable in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink
 
outputFolder - Variable in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
outputLang - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
Language used for the output files.
outputLang - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
outputResource - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
The URIs of the resources generated by this activity as well as their type as RDF Resource.

P

PAGESIZE - Variable in class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
PaginatedCkanFetcher - Class in org.dice_research.squirrel.fetcher.ckan.java
Simple Java-based CKAN Fetcher.
PaginatedCkanFetcher(int) - Constructor for class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
Time out for the Ckan Client
PaginatedCkanFetcher() - Constructor for class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
parseDataset(String) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanJsonAnalyzer
 
parser - Variable in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
parser - Variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
path - Variable in class org.dice_research.squirrel.configurator.HtmlScraperConfiguration
 
path - Variable in class org.dice_research.squirrel.fetcher.ftp.FTPRecursiveFetcher
 
performCrawling(CrawleableUri) - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
prepareMetadataModel() - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
Prepares the metadata model and returns it.
PROPERTY_MAPPING - Static variable in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 

Q

quad(Quad) - Method in class org.dice_research.squirrel.analyzer.commons.FilterSinkRDF
 
quadBuffer - Variable in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
QuadBuffer - Class in org.dice_research.squirrel.sink.impl.sparql
 
QuadBuffer() - Constructor for class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
QuadBuffer(int) - Constructor for class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
queryDatasetURI - Variable in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
The URI of the DB in which queries can be performed.
queryExecFactory - Variable in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
The Query factory used to query the SPARQL endpoint.
QueryGenerator - Class in org.dice_research.squirrel.sink.impl.sparql
This class provides queries for the basic SPARQL commands needed in this project.
QueryGenerator() - Constructor for class org.dice_research.squirrel.sink.impl.sparql.QueryGenerator
 

R

RDF - Static variable in class org.dice_research.squirrel.analyzer.manager.SimpleOrderedAnalyzerManager
Deprecated.
 
RDFaAnalyzer - Class in org.dice_research.squirrel.analyzer.impl
Analyzer to extract RDFa format triples
RDFaAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.RDFaAnalyzer
 
RDFAnalyzer - Class in org.dice_research.squirrel.analyzer.impl
Analyzer to parse RDF lang types
RDFAnalyzer(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.impl.RDFAnalyzer
 
RdfAutomata - Class in org.dice_research.squirrel.analyzer.mime
Default implementation of a finite state machine.
RdfAutomata(State) - Constructor for class org.dice_research.squirrel.analyzer.mime.RdfAutomata
Ctor.
rdfInput - Variable in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink.HDTParser
 
readAll(Reader) - Method in class org.dice_research.squirrel.fetcher.sparql.JsonFetcher
 
receiveNamespace(String, String, ExtractionContext) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
receiveTriple(Resource, IRI, Value, IRI, ExtractionContext) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
recieve() - Static method in class org.dice_research.squirrel.fetcher.ckan.py.MicroServiceBasedCkanFetcher
 
recursiveFetcher - Variable in class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
REGEX - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFileAtributes
 
replaceCommands(String) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
replaceVocab(String) - Static method in class org.dice_research.squirrel.analyzer.impl.html.mf.MicroformatsAnalyzer
 
replaceVocab(String) - Static method in class org.dice_research.squirrel.analyzer.impl.MicroformatMF2JAnalyzer
 
REQUEST_ACCEPT_HEADER_VALUE - Static variable in class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 
requestData(CrawleableUri, File) - Method in class org.dice_research.squirrel.fetcher.ftp.FTPFetcher
 
requestData(CrawleableUri, File) - Method in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
requestFinished() - Method in interface org.dice_research.squirrel.fetcher.delay.Delayer
Informs the Delayer instance about the finished request.
requestFinished() - Method in class org.dice_research.squirrel.fetcher.delay.DummyDelayer
 
requestFinished() - Method in class org.dice_research.squirrel.fetcher.delay.StaticDelayer
 
requestModel(URI) - Method in class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 
RESOURCES - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFileAtributes
 
resultSet - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher.SelectedTriplesIterator
 
robotRules - Variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
ROBOTS_FILE_NAME - Static variable in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
RobotsManager - Interface in org.dice_research.squirrel.robots
Interface of a class that can be used to access the robots.txt files.
RobotsManagerConfiguration - Class in org.dice_research.squirrel.configurator
 
RobotsManagerConfiguration(Long) - Constructor for class org.dice_research.squirrel.configurator.RobotsManagerConfiguration
 
RobotsManagerImpl - Class in org.dice_research.squirrel.robots
 
RobotsManagerImpl(BaseHttpFetcher) - Constructor for class org.dice_research.squirrel.robots.RobotsManagerImpl
 
RobotsManagerImpl(BaseHttpFetcher, BaseRobotsParser) - Constructor for class org.dice_research.squirrel.robots.RobotsManagerImpl
 
RtState - Class in org.dice_research.squirrel.analyzer.mime
State in a finite state machine.
RtState() - Constructor for class org.dice_research.squirrel.analyzer.mime.RtState
 
RtState(boolean, boolean) - Constructor for class org.dice_research.squirrel.analyzer.mime.RtState
 
RtTransition - Class in org.dice_research.squirrel.analyzer.mime
Transition in a finite state machine.
RtTransition(String, State) - Constructor for class org.dice_research.squirrel.analyzer.mime.RtTransition
Ctor.
rule - Variable in class org.dice_research.squirrel.analyzer.mime.RtTransition
 
run() - Method in class org.dice_research.squirrel.components.WorkerComponent
 
run() - Method in class org.dice_research.squirrel.components.WorkerComponentConfig
 
run() - Method in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink.HDTParser
 
run() - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 

S

scrape(String, File) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
scrapeDownloadLink(Map<String, Object>, File) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
scrapeTree(Map<String, Object>, Set<Triple>, Stack<Node>) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
SEARCH_CHECK - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFileAtributes
 
SEARCH_DOMAIN - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFileAtributes
 
SELECT_ALL_TRIPLES_QUERY - Static variable in class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 
SELECT_TABLE_QUERY - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
selectedMap - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
SelectedTriplesIterator(ResultSet) - Constructor for class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher.SelectedTriplesIterator
 
SelectedTriplesIterator(Iterator<Triple>) - Constructor for class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher.SelectedTriplesIterator
 
send(String) - Static method in class org.dice_research.squirrel.fetcher.ckan.py.MicroServiceBasedCkanFetcher
 
sendAliveMessages - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
sender - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
sender() - Method in class org.dice_research.squirrel.components.WorkerComponentConfig
 
senderDeduplicator - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
senderFrontier - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
sendNewUris(Iterator<byte[]>) - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
Sends the given URIs to the frontier.
sendQuads(CrawleableUri, Collection<Quad>) - Method in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
sendQuads(AbstractBufferingSink, CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.QuadBuffer
 
sendQuads(CrawleableUri, Collection<Quad>) - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
Method to send all buffered quads to the database
sendQuads(CrawleableUri, Collection<Quad>) - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
 
sendsAliveMessages() - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
sendTriples(CrawleableUri, Collection<Triple>) - Method in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
 
sendTriples(CrawleableUri, Collection<Triple>) - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
Method to send all buffered triples to the database
sendTriples(CrawleableUri, Collection<Triple>) - Method in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
Method to send all buffered triples to the database
sendTriples(AbstractBufferingSink, CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 
serializationDetector - Variable in class org.dice_research.squirrel.analyzer.impl.RDFAnalyzer
 
serializer - Variable in class org.dice_research.squirrel.collect.SimpleUriCollector
Serializer used to serialize the given URIs.
serializer - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
serializer - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
serializer - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
serialVersionUID - Static variable in exception org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions.ElementNotFoundException
 
serialVersionUID - Static variable in exception org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions.SyntaxParserException
 
serialVersionUID - Static variable in class org.dice_research.squirrel.metadata.CrawlingActivity
 
setAcceptCharset(String) - Method in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
setAcceptHeader(String) - Method in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
The value of the HTTP Accept header field that is used if the given CrawleableUri instance does not define this.
setBaseUri(String) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
setCheckForUriType(boolean) - Method in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
setContentLength(long) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
setDataDirectory(File) - Method in class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
setDataDirectory(File) - Method in class org.dice_research.squirrel.fetcher.http.HTTPFetcher
 
setDefaultMinWaitingTime(long) - Method in class org.dice_research.squirrel.robots.RobotsManagerImpl
 
setFetchers(Fetcher...) - Method in class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
setMetadataGraphUri(CrawleableUri) - Method in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
setNumberOfTriples(long) - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
setProperty(String, Object) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
setSearch(Map<String, Map<String, Object>>) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFile
 
setSpecificRecrawlTime(CrawleableUri) - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
setState(CrawlingActivity.CrawlingURIState) - Method in class org.dice_research.squirrel.metadata.CrawlingActivity
 
setTerminateFlag(boolean) - Method in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
setWorker(Worker) - Method in class org.dice_research.squirrel.components.WorkerComponent
 
SevenZipDecompressor - Class in org.dice_research.squirrel.analyzer.compress.impl
 
SevenZipDecompressor() - Constructor for class org.dice_research.squirrel.analyzer.compress.impl.SevenZipDecompressor
 
SimpleAnalyzerManager - Class in org.dice_research.squirrel.analyzer.manager
Class responsible for managing analyzers injected by the Spring Context
SimpleAnalyzerManager(UriCollector, String...) - Constructor for class org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager
Deprecated.
SimpleAnalyzerManager(List<AbstractAnalyzer>) - Constructor for class org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager
Receives the array of String injected by Spring and creates new instances of classes that implement the Analyzer interface.
SimpleCkanFetcher - Class in org.dice_research.squirrel.fetcher.ckan.java
Simple Java-based CKAN Fetcher.
SimpleCkanFetcher() - Constructor for class org.dice_research.squirrel.fetcher.ckan.java.SimpleCkanFetcher
 
SimpleOrderedAnalyzerManager - Class in org.dice_research.squirrel.analyzer.manager
Deprecated.
SimpleOrderedAnalyzerManager(UriCollector) - Constructor for class org.dice_research.squirrel.analyzer.manager.SimpleOrderedAnalyzerManager
Deprecated.
 
SimpleOrderedFetcherManager - Class in org.dice_research.squirrel.fetcher.manage
A very simple manager for Fetcher instances that is based on the order of the given fetchers.
SimpleOrderedFetcherManager(Fetcher...) - Constructor for class org.dice_research.squirrel.fetcher.manage.SimpleOrderedFetcherManager
 
SimpleUriCollector - Class in org.dice_research.squirrel.collect
Simple in-memory implementation of the UriCollector interface based on the given Serializer.
SimpleUriCollector(Serializer) - Constructor for class org.dice_research.squirrel.collect.SimpleUriCollector
Constructor.
sink - Variable in class org.dice_research.squirrel.analyzer.commons.FilterSinkRDF
 
sink - Variable in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
sink - Variable in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
sink - Variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
sink - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
SPARQL_HOST_CONTAINER_NAME_KEY - Static variable in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
SPARQL_HOST_PORTS_KEY - Static variable in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
SparqlBasedFetcher - Class in org.dice_research.squirrel.fetcher.sparql
A simple Fetcher for SPARQL that tries to get triples from a SPARQL endpoint using the query "SELECT ?s ?p ?o WHERE { GRAPH ?g { ?s ?p ?o } }".
SparqlBasedFetcher() - Constructor for class org.dice_research.squirrel.fetcher.sparql.SparqlBasedFetcher
 
SparqlBasedFetcher.SelectedTriplesIterator - Class in org.dice_research.squirrel.fetcher.sparql
 
SparqlBasedSink - Class in org.dice_research.squirrel.sink.impl.sparql
A sink which stores the data in different graphs in a SPARQL-based database.
SparqlBasedSink(QueryExecutionFactory, UpdateExecutionFactory, int, int) - Constructor for class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
SparqlBasedSink(QueryExecutionFactory, UpdateExecutionFactory) - Constructor for class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
SparqlDatasetFetcher - Class in org.dice_research.squirrel.fetcher.sparql
A simple Fetcher for SPARQL that tries to get DataSets from a SPARQL endpoint using the query .
SparqlDatasetFetcher() - Constructor for class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
SparqlDatasetFetcher(int, int, int) - Constructor for class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
SparqlDatasetFetcher(int) - Constructor for class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher
 
SparqlDatasetFetcher.SelectedTriplesIterator - Class in org.dice_research.squirrel.fetcher.sparql
 
sparqlHost - Variable in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
sqarqlPort - Variable in class org.dice_research.squirrel.configurator.WorkerConfiguration
 
SqlBasedUriCollector - Class in org.dice_research.squirrel.collect
An implementation of the UriCollector interface that is backed by a SQL database.
SqlBasedUriCollector(Serializer, String) - Constructor for class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
SqlBasedUriCollector.UriTableStatus - Class in org.dice_research.squirrel.collect
 
SquirrelClerezzaSink - Class in org.dice_research.squirrel.analyzer.commons
TripleSink for Clerezza
SquirrelClerezzaSink(CrawleableUri, UriCollector, Sink) - Constructor for class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
SquirrelTripleHandler - Class in org.dice_research.squirrel.analyzer.commons
Triple handler for Clerezza
SquirrelTripleHandler(CrawleableUri, UriCollector, Sink) - Constructor for class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
startDocument(IRI) - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelTripleHandler
 
startStream() - Method in class org.dice_research.squirrel.analyzer.commons.SquirrelClerezzaSink
 
state() - Method in class org.dice_research.squirrel.analyzer.mime.RtTransition
 
State - Interface in org.dice_research.squirrel.analyzer.mime
State.
state() - Method in interface org.dice_research.squirrel.analyzer.mime.Transition
The state to which this transition leads.
state - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
The crawling state of the URI.
StaticDelayer - Class in org.dice_research.squirrel.fetcher.delay
A simple implementation of the Delayer interface using a static delay between requests.
StaticDelayer(long) - Constructor for class org.dice_research.squirrel.fetcher.delay.StaticDelayer
Constructor.
StaticDelayer(long, long) - Constructor for class org.dice_research.squirrel.fetcher.delay.StaticDelayer
Constructor.
staticMap - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
steps - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
 
store(Resource, Property, RDFNode) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
storeContact(Resource, CkanDataset, Map<String, String>) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
storeMetadata - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
storePublisher(CkanDataset, Resource, Map<String, String>) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
storeResource(Resource, CkanResource) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
storeResourceOrText(Resource, Property, boolean, String...) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
storeResources(Resource, CkanDataset) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
storeTextLiteral(Resource, Property, boolean, String...) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
Method creating one or several literals based on the given data and storing it as object of the given subject and predicate.
storeTypedLiteral(Resource, Property, RDFDatatype, boolean, Object...) - Method in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
streamMapping - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
Synchronized mapping of crawled URIs to their output stream.
StreamStatus(CrawleableUri, File, Lang, boolean) - Constructor for class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
switchState(String) - Method in interface org.dice_research.squirrel.analyzer.mime.FiniteStateMachine
Follow a transition, switch the state of the machine.
switchState(String) - Method in class org.dice_research.squirrel.analyzer.mime.RdfAutomata
 
switchState(String) - Method in class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
 
SyntaxParserException - Exception in org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions
 
SyntaxParserException(String) - Constructor for exception org.dice_research.squirrel.analyzer.impl.html.scraper.exceptions.SyntaxParserException
 
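The StaticDelayer entry above describes a fixed delay between requests, with requestFinished() marking the end of a request. The following is a minimal sketch of that idea, not the Squirrel implementation: the class name StaticDelaySketch, the method remainingDelay and the explicit time parameters are assumptions chosen so that the logic can be checked without sleeping.

```java
/** Illustrative sketch only; names and signatures are hypothetical. */
class StaticDelaySketch {
    private final long delayMillis;      // fixed delay between two requests
    private long lastFinishedMillis;     // time at which the last request finished
    private boolean hasFinishedRequest;  // false until the first request has finished

    StaticDelaySketch(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    /** Records that a request finished at the given time. */
    synchronized void requestFinished(long nowMillis) {
        lastFinishedMillis = nowMillis;
        hasFinishedRequest = true;
    }

    /** Returns how long the caller should still wait before the next request. */
    synchronized long remainingDelay(long nowMillis) {
        if (!hasFinishedRequest) {
            return 0L; // nothing has been requested yet, so no waiting is needed
        }
        long elapsed = nowMillis - lastFinishedMillis;
        return Math.max(0L, delayMillis - elapsed);
    }

    public static void main(String[] args) {
        StaticDelaySketch delayer = new StaticDelaySketch(1000L);
        delayer.requestFinished(5000L);
        // 400 ms have passed since the last request finished
        System.out.println(delayer.remainingDelay(5400L));
    }
}
```

A real delayer would more likely read the clock itself and block the calling thread; passing the time in explicitly only keeps the sketch deterministic.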

T

TABLE_NAME_GENERATE_REGEX - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
TABLE_NAME_KEY - Static variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
tableName - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
TarDecompressor - Class in org.dice_research.squirrel.analyzer.compress.impl
Decompression implementation for the *.tar format
TarDecompressor() - Constructor for class org.dice_research.squirrel.analyzer.compress.impl.TarDecompressor
 
TDBSink - Class in org.dice_research.squirrel.sink.impl.sparql
A sink which stores the data in different graphs in a SPARQL-based database.
TDBSink(String, String, String, String, String, String) - Constructor for class org.dice_research.squirrel.sink.impl.sparql.TDBSink
Constructor of TDBSink.
terminateFlag - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
timeout - Variable in class org.dice_research.squirrel.fetcher.ckan.java.PaginatedCkanFetcher
 
timerAliveMessages - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
total_uris - Variable in class org.dice_research.squirrel.collect.SimpleUriCollector
Number of all URIs collected.
total_uris - Variable in class org.dice_research.squirrel.collect.SqlBasedUriCollector
 
transit(String) - Method in class org.dice_research.squirrel.analyzer.mime.RtState
 
transit(String) - Method in interface org.dice_research.squirrel.analyzer.mime.State
Follow one of the transitions, to get to the next state.
Transition - Interface in org.dice_research.squirrel.analyzer.mime
Transition in a finite state machine.
transitions - Variable in class org.dice_research.squirrel.analyzer.mime.RtState
 
triple(Triple) - Method in class org.dice_research.squirrel.analyzer.commons.FilterSinkRDF
 
tripleBuffer - Variable in class org.dice_research.squirrel.sink.impl.sparql.AbstractBufferingSink
The data structure (map) in which the triples are buffered.
TripleBuffer - Class in org.dice_research.squirrel.sink.impl.sparql
 
TripleBuffer() - Constructor for class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 
TripleBuffer(int) - Constructor for class org.dice_research.squirrel.sink.impl.sparql.TripleBuffer
 
tripleComparator - Static variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
tripleCount - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
tripleEncoder - Variable in class org.dice_research.squirrel.analyzer.AbstractAnalyzer
 
tripleEncoder - Variable in class org.dice_research.squirrel.analyzer.impl.ckan.CkanDatasetConsumer
 
TripleEncoder - Class in org.dice_research.squirrel.encoder
Class that can encode triples. The encodeTriple method escapes the triple's resource and object following the Jena escaping rules of NodeFactory.
TripleEncoder() - Constructor for class org.dice_research.squirrel.encoder.TripleEncoder
 
tripleEncoder - Static variable in class org.dice_research.squirrel.encoder.TripleEncoder
 
tripleOutputStream - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
triples - Variable in class org.dice_research.squirrel.fetcher.sparql.SparqlDatasetFetcher.SelectedTriplesIterator
 
tripleStream - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
TurtleAutomata - Class in org.dice_research.squirrel.analyzer.mime
Default implementation of a finite state machine.
TurtleAutomata(State) - Constructor for class org.dice_research.squirrel.analyzer.mime.TurtleAutomata
Ctor.
TypeDetector - Interface in org.dice_research.squirrel.analyzer.mime
This interface defines the functionality to detect the mime-types, especially of RDF serializations
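The entries in org.dice_research.squirrel.analyzer.mime describe a finite state machine built from states (with, transit), transitions (state) and a machine that follows them (switchState). The sketch below shows how such pieces could fit together. It is not the Squirrel code: all class names are hypothetical, and matching a transition by exact string equality with its rule is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the Squirrel implementation. */
class FsmSketch {

    /** A state that knows its outgoing transitions. */
    static final class SketchState {
        private final boolean accepting;
        private final List<SketchTransition> transitions = new ArrayList<>();

        SketchState(boolean accepting) {
            this.accepting = accepting;
        }

        /** Adds a transition and returns this state to allow chaining. */
        SketchState with(SketchTransition transition) {
            transitions.add(transition);
            return this;
        }

        /** Follows the first transition whose rule equals the given symbol. */
        SketchState transit(String symbol) {
            for (SketchTransition t : transitions) {
                if (t.rule.equals(symbol)) {
                    return t.target;
                }
            }
            throw new IllegalStateException("No transition for: " + symbol);
        }

        boolean isAccepting() {
            return accepting;
        }
    }

    /** A transition: the rule it matches and the state it leads to. */
    static final class SketchTransition {
        final String rule;
        final SketchState target;

        SketchTransition(String rule, SketchState target) {
            this.rule = rule;
            this.target = target;
        }
    }

    /** The machine keeps the current state and switches it per input symbol. */
    static final class SketchMachine {
        private SketchState current;

        SketchMachine(SketchState initial) {
            this.current = initial;
        }

        void switchState(String symbol) {
            current = current.transit(symbol);
        }

        boolean accepts() {
            return current.isAccepting();
        }
    }

    /** Returns true if the symbols are exactly "@prefix" followed by ".". */
    static boolean accepts(String... symbols) {
        SketchState end = new SketchState(true);
        SketchState afterPrefix = new SketchState(false).with(new SketchTransition(".", end));
        SketchState start = new SketchState(false).with(new SketchTransition("@prefix", afterPrefix));
        SketchMachine machine = new SketchMachine(start);
        try {
            for (String symbol : symbols) {
                machine.switchState(symbol);
            }
        } catch (IllegalStateException e) {
            return false; // an unexpected symbol rejects the input
        }
        return machine.accepts();
    }

    public static void main(String[] args) {
        System.out.println(accepts("@prefix", "."));
        System.out.println(accepts("."));
    }
}
```

The two-symbol toy grammar is only there to exercise the machine; a detector for a real serialization would need many more states and probably pattern-based rules.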

U

unzip(String, String, String) - Static method in class org.dice_research.squirrel.fetcher.utils.ZipArchiver
 
updateDatasetURI - Variable in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
The URI of the DB in which updates can be performed.
updatedObjects - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
updateExecFactory - Variable in class org.dice_research.squirrel.sink.impl.sparql.SparqlBasedSink
 
updateMetaDataUri - Variable in class org.dice_research.squirrel.sink.impl.sparql.TDBSink
The URI to the metadata DB in which updates can be performed.
updateRelationship(List<Triple>) - Method in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
Update the triples with nested objects
uri - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
uri - Variable in enum org.dice_research.squirrel.metadata.CrawlingActivity.CrawlingURIState
 
uri - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
The URI of the crawling activity.
uri - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
uri - Variable in class org.dice_research.squirrel.sink.impl.hdt.HdtBasedSink.HDTParser
 
uri - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
UriCollector - Interface in org.dice_research.squirrel.collect
A URI collector stores the URIs that have been found by a worker while crawling/processing a certain URI.
uriProcessor - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
uriSetRequest - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
urisOfUris - Variable in class org.dice_research.squirrel.collect.SimpleUriCollector
Mapping from URIs to the new URIs that have been found.
uriString - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
UriTableStatus(String, PreparedStatement, int) - Constructor for class org.dice_research.squirrel.collect.SqlBasedUriCollector.UriTableStatus
 
useCompression - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink.StreamStatus
 
useCompression - Variable in class org.dice_research.squirrel.sink.impl.file.FileBasedSink
Flag whether a compression algorithm should be used.
USER_AGENT - Static variable in class org.dice_research.squirrel.fetcher.deref.DereferencingFetcher
Deprecated.
 

V

valueOf(String) - Static method in enum org.dice_research.squirrel.analyzer.compress.enums.MimeTypeEnum
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.dice_research.squirrel.metadata.CrawlingActivity.CrawlingURIState
Returns the enum constant of this type with the specified name.
values() - Static method in enum org.dice_research.squirrel.analyzer.compress.enums.MimeTypeEnum
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.dice_research.squirrel.metadata.CrawlingActivity.CrawlingURIState
Returns an array containing the constants of this enum type, in the order they are declared.

W

waitingTime - Variable in class org.dice_research.squirrel.worker.impl.WorkerImpl
 
with(Transition) - Method in class org.dice_research.squirrel.analyzer.mime.RtState
 
with(Transition) - Method in interface org.dice_research.squirrel.analyzer.mime.State
Add a Transition to this state.
worker - Variable in class org.dice_research.squirrel.components.WorkerComponent
 
WorkerComponent - Class in org.dice_research.squirrel.components
 
WorkerComponent() - Constructor for class org.dice_research.squirrel.components.WorkerComponent
 
WorkerComponentConfig - Class in org.dice_research.squirrel.components
 
WorkerComponentConfig() - Constructor for class org.dice_research.squirrel.components.WorkerComponentConfig
 
WorkerComponentStarter - Class in org.dice_research.squirrel.components
 
WorkerComponentStarter() - Constructor for class org.dice_research.squirrel.components.WorkerComponentStarter
 
WorkerConfiguration - Class in org.dice_research.squirrel.configurator
 
WorkerConfiguration(String, String, String) - Constructor for class org.dice_research.squirrel.configurator.WorkerConfiguration
 
WorkerImpl - Class in org.dice_research.squirrel.worker.impl
Standard implementation of the Worker interface.
WorkerImpl(Frontier, Fetcher, Sink, Analyzer, RobotsManager, Serializer, UriCollector, long, String, boolean, boolean) - Constructor for class org.dice_research.squirrel.worker.impl.WorkerImpl
Constructor.
workerUri - Variable in class org.dice_research.squirrel.metadata.CrawlingActivity
URI of the worker assigned to carry out this activity.

Y

YamlFile - Class in org.dice_research.squirrel.analyzer.impl.html.scraper
 
YamlFile() - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFile
 
YamlFileAtributes - Class in org.dice_research.squirrel.analyzer.impl.html.scraper
 
YamlFileAtributes() - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFileAtributes
 
yamlFiles - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.HtmlScraper
 
YamlFilesParser - Class in org.dice_research.squirrel.analyzer.impl.html.scraper
Parser for Yaml Files, for the HTML Scraper
YamlFilesParser(File) - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 
YamlFilesParser() - Constructor for class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 
yfs - Variable in class org.dice_research.squirrel.analyzer.impl.html.scraper.YamlFilesParser
 

Z

ZipArchiver - Class in org.dice_research.squirrel.fetcher.utils
Created by ivan on 8/11/16.
ZipArchiver() - Constructor for class org.dice_research.squirrel.fetcher.utils.ZipArchiver
 
ZipDecompressor - Class in org.dice_research.squirrel.analyzer.compress.impl
Decompression implementation for the BZip format
ZipDecompressor() - Constructor for class org.dice_research.squirrel.analyzer.compress.impl.ZipDecompressor
 

Copyright © 2017–2020. All rights reserved.