alltypes - deduper

All Types

org.bradfordmiller.deduper.consumers.BaseConsumer

Base definition of a runnable Consumer. Consumers are responsible for persisting data to disk.

org.bradfordmiller.deduper.config.Config

Configuration for a deduper process

org.bradfordmiller.deduper.csv.CsvConfigParser

Parses config settings that define the format for outputting csv data. Note that the default file extension for csv output is 'txt' and the default delimiter is a comma.

org.bradfordmiller.deduper.persistors.CsvDupePersistor

Creates and writes out duplicate data to a csv target. The duplicate target is configured in config.

org.bradfordmiller.deduper.persistors.CsvHashPersistor

Creates and writes out hash values found in a deduper process to a csv target defined in config

org.bradfordmiller.deduper.jndi.CsvJNDITargetType

Defines output information for csv target data based on the jndi name and jndi context

org.bradfordmiller.deduper.persistors.CsvPersistor

Parser for jndi entries that are configured for csv output. Parses the values found in config.

org.bradfordmiller.deduper.persistors.CsvTargetPersistor

Creates and writes out "deduped" data to a csv target. The target is configured in config.

org.bradfordmiller.deduper.Deduper

Dedupes data based on config settings

org.bradfordmiller.deduper.consumers.DeduperDataConsumer

Consumer for processing and persisting target data, i.e. "deduped" data

org.bradfordmiller.deduper.consumers.DeduperDupeConsumer

Consumer for processing and persisting duplicate data

org.bradfordmiller.deduper.DedupeReport

Summary of a dedupe operation, with the total recordCount, the hashColumns used, the columnsFound in the actual source query, dupeCount, distinctDupeCount, and the dupes found in the dedupe process.
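
To make the counting fields concrete, the following is a minimal sketch, not the library's implementation. It assumes that dupeCount counts every row that repeats an earlier row and that distinctDupeCount counts how many distinct rows were repeated at least once; both readings are assumptions. The class `DedupeCountsSketch`, the holder `Counts`, and the use of plain string keys in place of row hashes are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupeCountsSketch {

    /** Hypothetical holder for three of the counts named in the description. */
    public static final class Counts {
        public final long recordCount;
        public final long dupeCount;
        public final long distinctDupeCount;

        Counts(long recordCount, long dupeCount, long distinctDupeCount) {
            this.recordCount = recordCount;
            this.dupeCount = dupeCount;
            this.distinctDupeCount = distinctDupeCount;
        }
    }

    /** Counts repeats by key; each key stands in for the hash of one source row. */
    public static Counts count(List<String> rowKeys) {
        Map<String, Integer> seen = new HashMap<>();
        long dupeCount = 0;
        long distinctDupeCount = 0;
        for (String key : rowKeys) {
            int timesSeen = seen.merge(key, 1, Integer::sum);
            if (timesSeen > 1) {
                dupeCount++;             // every repeat counts as one duplicate row
                if (timesSeen == 2) {
                    distinctDupeCount++; // first time this particular key repeats
                }
            }
        }
        return new Counts(rowKeys.size(), dupeCount, distinctDupeCount);
    }
}
```

Under these assumptions, the keys a, b, a, a, c, b give a recordCount of 6, a dupeCount of 3, and a distinctDupeCount of 2.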

org.bradfordmiller.deduper.consumers.DeduperHashConsumer

Consumer for processing and persisting MD5 hash data

org.bradfordmiller.deduper.DeduperProducer

org.bradfordmiller.deduper.persistors.Dupe

Represents a simple duplicate value found by deduper.

org.bradfordmiller.deduper.persistors.DupePersistor

Definition for writing out duplicate data to a target flat file or sql table

org.bradfordmiller.deduper.config.ExecutionServiceTimeout

Settings for the Execution Service timeout. Once a deduper has published all data to the blocking queues of all consumers, a configurable timeout (defaults to 60 seconds) is set for all consumers to finish persisting data.
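
The pattern described here can be sketched with the standard `java.util.concurrent` API. This is an illustration under stated assumptions, not the library's code: the class `ConsumerTimeoutSketch`, the method `runWithTimeout`, and the `DONE` end-of-data marker are hypothetical, and a single consumer stands in for all of them.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConsumerTimeoutSketch {
    // Hypothetical marker telling the consumer that no more rows will arrive.
    private static final String DONE = "__done__";

    /** Publishes rows to one consumer, then waits up to the timeout for it to finish. */
    public static boolean runWithTimeout(int rows, long timeout, TimeUnit unit) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger persisted = new AtomicInteger();
        ExecutorService executor = Executors.newSingleThreadExecutor();

        executor.execute(() -> {
            try {
                while (true) {
                    String row = queue.take();
                    if (row.equals(DONE)) {
                        break;
                    }
                    persisted.incrementAndGet(); // stand-in for writing the row out
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Producer side: publish everything, then signal the end of the data.
        for (int i = 0; i < rows; i++) {
            queue.add("row-" + i);
        }
        queue.add(DONE);

        // Stop accepting new tasks and give the consumer a bounded time to drain.
        executor.shutdown();
        try {
            boolean finished = executor.awaitTermination(timeout, unit);
            if (!finished) {
                executor.shutdownNow(); // timed out: interrupt the consumer
            }
            return finished && persisted.get() == rows;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling `runWithTimeout(100, 60, TimeUnit.SECONDS)` mirrors the 60-second default mentioned above; a shorter timeout simply bounds how long the caller waits before interrupting the consumer.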

org.bradfordmiller.deduper.utils.FileUtils

A utility library for file operations

org.bradfordmiller.deduper.hashing.Hasher

Utility class for hashing methods

org.bradfordmiller.deduper.persistors.HashPersistor

Definition for writing out hash values of rows found in source data

org.bradfordmiller.deduper.persistors.HashRow

Represents hashed data created by deduper

org.bradfordmiller.deduper.config.HashSourceJndi

A hash source jndi entity. This is used when configuring a specific set of existing hashes to "dedupe" against

org.bradfordmiller.deduper.jndi.JNDITargetType

Defines output information for target data based on the jndi name and jndi context

org.bradfordmiller.deduper.SampleRow

Representation of a sample of data, showing the comma-delimited sampleString and the associated sampleHash for that sample string
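
As a rough illustration of how a sampleString and sampleHash pair might relate, the sketch below joins column values with commas and hashes the result with MD5 using the JDK's `MessageDigest`. That the hash is taken over the UTF-8 bytes of the joined string and rendered as lowercase hex is an assumption; the class `SampleHashSketch` and its methods are hypothetical and not part of deduper.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

public class SampleHashSketch {

    /** Joins the column values of one row into a comma-delimited string. */
    public static String sampleString(List<String> values) {
        return String.join(",", values);
    }

    /** Returns the MD5 digest of the input as a 32-character lowercase hex string. */
    public static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff)); // mask to treat the byte as unsigned
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every standard JVM ships MD5, so this should not happen in practice.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Two rows with the same values in the same columns produce the same string and therefore the same hash, which is what makes the hash usable as a duplicate key.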

org.bradfordmiller.deduper.config.SourceJndi

A source jndi entity

org.bradfordmiller.deduper.persistors.SqlDupePersistor

Creates a sql table for persisting duplicate data. This is configured using the dupesJndi contained in the associated context

org.bradfordmiller.deduper.persistors.SqlHashPersistor

Creates a sql table for persisting hashed data rows. This is configured using the hashJndi contained in the associated context

org.bradfordmiller.deduper.jndi.SqlJNDIDupeType

Defines output information for sql duplicate data based on the jndi name and jndi context

org.bradfordmiller.deduper.jndi.SqlJNDIHashType

Defines output information for sql hash data based on the jndi name and jndi context

org.bradfordmiller.deduper.jndi.SqlJNDITargetType

Defines output information for sql target data based on the jndi name and jndi context

org.bradfordmiller.deduper.persistors.SqlTargetPersistor

Creates and writes out "deduped" data to a sql table. targetName is the table name in the javax.sql.DataSource configured in the targetJndi for the associated context. varcharPadding is a number of extra bytes which can be configured if the target needs larger varchar fields than were extracted by the source.
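
One way to picture varcharPadding is as an amount added to each source column's size when the target column is declared. The sketch below is only that picture: the class `VarcharPaddingSketch`, the method `columnDdl`, and the exact DDL text are hypothetical, and the library may apply the padding differently.

```java
public class VarcharPaddingSketch {

    /** Builds a column definition whose width is the source size plus the padding. */
    public static String columnDdl(String name, int sourceSize, int varcharPadding) {
        int targetSize = sourceSize + varcharPadding; // widen the target column
        return name + " VARCHAR(" + targetSize + ")";
    }
}
```

For example, a source column `city` of size 20 with a padding of 10 would be declared as `city VARCHAR(30)` under this assumption.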

org.bradfordmiller.deduper.persistors.TargetPersistor

Definition for writing out deduped data to a target flat file or sql table

org.bradfordmiller.deduper.persistors.WritePersistor

Base definition for writing out output data