Class Dataset
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description

final Dataset batch(long batchSize)
    Groups elements of this dataset into batches.
final Dataset batch(long batchSize, boolean dropLastBatch)
    Groups elements of this dataset into batches.
static Dataset fromTensorSlices(Ops tf, List<Operand<?>> tensors, List<Class<? extends TType>> outputTypes)
    Creates an in-memory `Dataset` whose elements are slices of the given tensors.
getOpsInstance()
    Gets the TensorFlow Ops instance for this dataset.
getOutputShapes()
    Gets a list of shapes for each component of this dataset.
getOutputTypes()
    Gets a list of output types for each component of this dataset.
Operand<?> getVariant()
    Gets the variant tensor representing this dataset.
iterator()
    Creates an iterator which iterates through all batches of this Dataset in an eager fashion.
makeInitializeableIterator()
    Creates a `DatasetIterator` that can be used to iterate over elements of this dataset.
makeOneShotIterator()
    Creates a `DatasetIterator` that can be used to iterate over elements of this dataset.
map(Function<List<Operand<?>>, List<Operand<?>>> mapper)
    Returns a new Dataset which maps a function over all elements returned by this dataset.
mapAllComponents(Function<Operand<?>, Operand<?>> mapper)
    Returns a new Dataset which maps a function across all elements from this dataset, on all components of each element.
mapOneComponent(int index, Function<Operand<?>, Operand<?>> mapper)
    Returns a new Dataset which maps a function across all elements from this dataset, on a single component of each element.
final Dataset skip(long count)
    Returns a new `Dataset` which skips `count` initial elements from this dataset.
final Dataset take(long count)
    Returns a new `Dataset` with only the first `count` elements from this dataset.
static Dataset textLineDataset(Ops tf, String filename, String compressionType, long bufferSize)
    Creates a TextLineDataset from a file containing one record per line.
static Dataset tfRecordDataset(Ops tf, String filename, String compressionType, long bufferSize)
    Creates a TFRecordDataset from a file containing TFRecords.
toString()

Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Iterable
forEach, spliterator
-
Field Details
-
tf
-
-
Constructor Details
-
Dataset
public Dataset(Ops tf, Operand<?> variant, List<Class<? extends TType>> outputTypes, List<Shape> outputShapes)
Creates a Dataset.
Parameters:
tf - the TensorFlow Ops
variant - the tensor that represents the dataset
outputTypes - a list of output types produced by this dataset
outputShapes - a list of output shapes produced by this dataset
-
Dataset
Creates a Dataset that is a copy of another Dataset.
Parameters:
other - the other Dataset
-
-
Method Details
-
batch
Groups elements of this dataset into batches.
Parameters:
batchSize - The desired number of elements per batch.
dropLastBatch - Whether to leave out the final batch if it has fewer than `batchSize` elements.
Returns:
A batched Dataset
-
batch
Groups elements of this dataset into batches. Includes the last batch, even if it has fewer than `batchSize` elements.
Parameters:
batchSize - The desired number of elements per batch.
Returns:
A batched Dataset
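
The effect of `dropLastBatch` can be illustrated without TensorFlow. The following is a minimal plain-Java sketch of the batching semantics described above, applied to an ordinary list; `BatchSketch` and its `batch` helper are hypothetical names invented for this illustration and are not part of the `Dataset` API.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    /**
     * Splits items into consecutive groups of batchSize elements.
     * Hypothetical helper: models the documented semantics only.
     */
    public static <T> List<List<T>> batch(List<T> items, int batchSize, boolean dropLastBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            boolean isShort = (end - start) < batchSize;
            if (isShort && dropLastBatch) {
                break; // leave out the final, incomplete batch
            }
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> elements = List.of(1, 2, 3, 4, 5);
        System.out.println(batch(elements, 2, false)); // [[1, 2], [3, 4], [5]]
        System.out.println(batch(elements, 2, true));  // [[1, 2], [3, 4]]
    }
}
```

With five elements and a batch size of 2, the single-argument form corresponds to the first call (the short batch is kept), while passing `true` for `dropLastBatch` corresponds to the second.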
-
skip
Returns a new `Dataset` which skips `count` initial elements from this dataset.
Parameters:
count - The number of elements to skip to form the new dataset.
Returns:
A new Dataset with the first `count` elements removed.
-
take
Returns a new `Dataset` with only the first `count` elements from this dataset.
Parameters:
count - The number of elements to take from this dataset.
Returns:
A new Dataset containing the first `count` elements from this dataset.
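
As a rough analogy, `skip` and `take` behave like `Stream.skip` and `Stream.limit` on an ordered sequence. The sketch below models that with plain Java lists; `SkipTakeSketch` and its helpers are hypothetical names used only for illustration, and the sketch does not cover how the real `Dataset` treats edge cases such as negative counts.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SkipTakeSketch {
    /** Models skip: drops the first count elements (hypothetical helper). */
    public static <T> List<T> skip(List<T> items, long count) {
        return items.stream().skip(count).collect(Collectors.toList());
    }

    /** Models take: keeps only the first count elements (hypothetical helper). */
    public static <T> List<T> take(List<T> items, long count) {
        return items.stream().limit(count).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> elements = List.of(10, 20, 30, 40, 50);
        System.out.println(skip(elements, 2));          // [30, 40, 50]
        System.out.println(take(elements, 2));          // [10, 20]
        System.out.println(take(skip(elements, 1), 3)); // [20, 30, 40]
    }
}
```

Chaining the two, as in the last call, is a common way to carve a fixed window out of a larger sequence, for example to separate a validation split from training data.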
-
mapOneComponent
Returns a new Dataset which maps a function across all elements from this dataset, on a single component of each element.
For example, suppose each element is a List<Operand<?>> with 2 components: (features, labels).
Calling
    dataset.mapOneComponent(0, features -> tf.math.mul(features, tf.constant(2)))
will map the function over the `features` component of each element, multiplying each by 2.
Parameters:
index - The index of the component to transform.
mapper - The function to apply to the target component.
Returns:
A new Dataset applying `mapper` to the component at the chosen index.
-
mapAllComponents
Returns a new Dataset which maps a function across all elements from this dataset, on all components of each element.
For example, suppose each element is a List<Operand<?>> with 2 components: (features, labels).
Calling
    dataset.mapAllComponents(component -> tf.math.mul(component, tf.constant(2)))
will map the function over both the `features` and `labels` components of each element, multiplying them all by 2.
Parameters:
mapper - The function to apply to each component.
Returns:
A new Dataset applying `mapper` to all components of each element.
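
The difference between `mapOneComponent` and `mapAllComponents` may be easier to see on plain values. The following sketch models each dataset element as a `List<Integer>` of components (features, labels) rather than as `Operand<?>` tensors; `ComponentMapSketch` and its two static helpers are hypothetical and only mimic the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class ComponentMapSketch {
    /** Applies mapper to the component at index in every element (hypothetical helper). */
    public static <T> List<List<T>> mapOneComponent(
            List<List<T>> elements, int index, Function<T, T> mapper) {
        List<List<T>> result = new ArrayList<>();
        for (List<T> element : elements) {
            List<T> mapped = new ArrayList<>(element); // copy, then replace one component
            mapped.set(index, mapper.apply(element.get(index)));
            result.add(mapped);
        }
        return result;
    }

    /** Applies mapper to every component of every element (hypothetical helper). */
    public static <T> List<List<T>> mapAllComponents(
            List<List<T>> elements, Function<T, T> mapper) {
        List<List<T>> result = new ArrayList<>();
        for (List<T> element : elements) {
            List<T> mapped = new ArrayList<>();
            for (T component : element) {
                mapped.add(mapper.apply(component));
            }
            result.add(mapped);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two elements, each with components (features, labels).
        List<List<Integer>> elements = List.of(List.of(1, 10), List.of(2, 20));
        System.out.println(mapOneComponent(elements, 0, x -> x * 2)); // [[2, 10], [4, 20]]
        System.out.println(mapAllComponents(elements, x -> x * 2));   // [[2, 20], [4, 40]]
    }
}
```

In both cases the input is left unchanged and a new collection is returned, mirroring the "returns a new Dataset" wording in the method descriptions.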
-
map
Returns a new Dataset which maps a function over all elements returned by this dataset.
For example, suppose each element is a List<Operand<?>> with 2 components: (features, labels).
Calling
    dataset.map(components -> {
        Operand<?> features = components.get(0);
        Operand<?> labels = components.get(1);
        return Arrays.asList(
            tf.math.mul(features, tf.constant(2)),
            tf.math.mul(labels, tf.constant(5))
        );
    });
will map the function over the `features` and `labels` components, multiplying features by 2, and multiplying the labels by 5.
Parameters:
mapper - The function to apply to each element of this dataset.
Returns:
A new Dataset applying `mapper` to each element of this dataset.
-
iterator
Creates an iterator which iterates through all batches of this Dataset in an eager fashion. Each batch is a list of components, returned as `Output` objects.
This method enables for-each iteration through batches when running in eager mode. For graph mode batch iteration, see `makeOneShotIterator`.
-
makeInitializeableIterator
Creates a `DatasetIterator` that can be used to iterate over elements of this dataset.
This iterator must be initialized with a call to `iterator.makeInitializer(Dataset)` before elements can be retrieved in a loop.
Returns:
A new `DatasetIterator` based on this dataset's structure.
-
makeOneShotIterator
Creates a `DatasetIterator` that can be used to iterate over elements of this dataset. Using `makeOneShotIterator` ensures that the iterator is automatically initialized on this dataset.
In graph mode, the initializer op will be added to the Graph's initializer list, which must be run via `tf.init()`. For example:
    try (Session session = new Session(graph)) {
        // Immediately run initializers
        session.initialize();
    }
In eager mode, the initializer will be run automatically as a result of this call.
Returns:
A new `DatasetIterator` based on this dataset's structure.
-
fromTensorSlices
public static Dataset fromTensorSlices(Ops tf, List<Operand<?>> tensors, List<Class<? extends TType>> outputTypes)
Creates an in-memory `Dataset` whose elements are slices of the given tensors. Each element of this dataset will be a List<Operand<?>>, representing slices (e.g. batches) of the provided tensors.
Parameters:
tf - Ops accessor
tensors - A list of Operand<?> representing components of this dataset (e.g. features, labels)
outputTypes - A list of tensor type classes representing the data type of each component of this dataset.
Returns:
A new `Dataset`
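
To make "slices of the given tensors" concrete, the sketch below models the idea with plain Java arrays instead of tensors. It assumes slicing happens along the first dimension, so that element i pairs row i of the features with entry i of the labels; `TensorSlicesSketch`, `Element`, and `fromSlices` are hypothetical names for this illustration and do not exist in the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TensorSlicesSketch {
    /** One dataset element: a (features, label) pair. Hypothetical type. */
    public static final class Element {
        public final int[] features;
        public final int label;

        Element(int[] features, int label) {
            this.features = features;
            this.label = label;
        }
    }

    /**
     * Pairs row i of features with entry i of labels, producing one element
     * per index along the first dimension (an assumption of this sketch).
     */
    public static List<Element> fromSlices(int[][] features, int[] labels) {
        if (features.length != labels.length) {
            throw new IllegalArgumentException("components must share their first dimension");
        }
        List<Element> elements = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            elements.add(new Element(features[i], labels[i]));
        }
        return elements;
    }

    public static void main(String[] args) {
        int[][] features = {{1, 2}, {3, 4}, {5, 6}};
        int[] labels = {0, 1, 0};
        for (Element e : fromSlices(features, labels)) {
            System.out.println(Arrays.toString(e.features) + " -> " + e.label);
        }
        // [1, 2] -> 0
        // [3, 4] -> 1
        // [5, 6] -> 0
    }
}
```

A features array of shape (3, 2) and a labels array of shape (3) therefore yield three elements, each with a length-2 features slice and a scalar label.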
-
tfRecordDataset
public static Dataset tfRecordDataset(Ops tf, String filename, String compressionType, long bufferSize)
Creates a TFRecordDataset from a file containing TFRecords.
Parameters:
tf - the TensorFlow Ops
filename - the name of the file that holds the TFRecords
compressionType - the compression type for the file
bufferSize - the buffer size for processing the TFRecords file
Returns:
a TFRecordDataset
-
textLineDataset
public static Dataset textLineDataset(Ops tf, String filename, String compressionType, long bufferSize)
Creates a TextLineDataset from a file containing one record per line.
Parameters:
tf - the TensorFlow Ops
filename - the name of the file that holds the data records
compressionType - the compression type for the file
bufferSize - the buffer size for processing the records file
Returns:
a TextLineDataset
-
getVariant
Gets the variant tensor representing this dataset.
Returns:
the variant tensor representing this dataset
-
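getOutputTypes
Gets a list of output types for each component of this dataset.
-
getOutputShapes
Gets a list of shapes for each component of this dataset.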
-
getOpsInstance
Gets the TensorFlow Ops instance for this dataset.
Returns:
the TensorFlow Ops instance for this dataset
-
toString
-