|
Thrill
0.1
|
Namespaces | |
| glob_local | |
Classes | |
| struct | FileInfo |
| General information of vfs file. More... | |
| struct | FileList |
| List of file info and additional overall info. More... | |
| class | ReadStream |
| Reader object from any source. More... | |
| class | TemporaryDirectory |
| A class which creates a temporary directory in the current directory and returns it via get(). More... | |
| class | WriteStream |
| Writer object to output data to any supported URI. More... | |
Typedefs | |
| using | ReadStreamPtr = tlx::CountingPtr< ReadStream > |
| using | WriteStreamPtr = tlx::CountingPtr< WriteStream > |
Enumerations | |
| enum | GlobType { All, File, Directory } |
| Type of objects to include in glob result. More... | |
| enum | Type { File, Directory } |
| VFS object type. More... | |
Functions | |
| void | Deinitialize () |
| Deinitialize VFS layer. More... | |
| std::string | FillFilePattern (const std::string &pathbase, size_t worker, size_t file_part) |
| FileList | Glob (const std::vector< std::string > &globlist, const GlobType &gtype=GlobType::All) |
| Reads a list of glob paths and delivers a file list, sizes, and prefix sums (in bytes) for all matching files. More... | |
| FileList | Glob (const std::string &glob, const GlobType &gtype=GlobType::All) |
| Reads a single glob path and delivers a file list, sizes, and prefix sums (in bytes) for all matching files. More... | |
| void | Hdfs3Deinitialize () |
| void | Hdfs3Glob (const std::string &, const GlobType &, FileList &) |
| void | Hdfs3Initialize () |
| ReadStreamPtr | Hdfs3OpenReadStream (const std::string &, const common::Range &) |
| WriteStreamPtr | Hdfs3OpenWriteStream (const std::string &) |
| void | Initialize () |
| Initialize VFS layer. More... | |
| bool | IsCompressed (const std::string &path) |
| bool | IsRemoteUri (const std::string &path) |
| Returns true if path is a remote URI such as s3:// or hdfs://. More... | |
| ReadStreamPtr | MakeBZip2ReadFilter (const ReadStreamPtr &) |
| WriteStreamPtr | MakeBZip2WriteFilter (const WriteStreamPtr &) |
| ReadStreamPtr | MakeGZipReadFilter (const ReadStreamPtr &) |
| WriteStreamPtr | MakeGZipWriteFilter (const WriteStreamPtr &) |
| ReadStreamPtr | OpenReadStream (const std::string &path, const common::Range &range=common::Range()) |
| Construct reader for given path uri. More... | |
| WriteStreamPtr | OpenWriteStream (const std::string &path) |
| std::ostream & | operator<< (std::ostream &os, const Type &t) |
| void | S3Deinitialize () |
| void | S3Glob (const std::string &, const GlobType &, FileList &) |
| void | S3Initialize () |
| ReadStreamPtr | S3OpenReadStream (const std::string &, const common::Range &) |
| WriteStreamPtr | S3OpenWriteStream (const std::string &) |
| void | SysGlob (const std::string &path, const GlobType &gtype, FileList &filelist) |
| Glob a path and augment the FileList with matching file names. More... | |
| static void | SysGlobWalkRecursive (const std::string &path, FileList &filelist) |
| ReadStreamPtr | SysOpenReadStream (const std::string &path, const common::Range &range=common::Range()) |
| Open file for reading and return file descriptor. More... | |
| WriteStreamPtr | SysOpenWriteStream (const std::string &path) |
| Open file for writing and return file descriptor. More... | |
| using ReadStreamPtr = tlx::CountingPtr<ReadStream> |
Definition at line 145 of file file_io.hpp.
| using WriteStreamPtr = tlx::CountingPtr<WriteStream> |
Definition at line 146 of file file_io.hpp.
| enum GlobType |
strong |
Type of objects to include in glob result.
| Enumerator | |
|---|---|
| All | |
| File | |
| Directory | |
Definition at line 99 of file file_io.hpp.
| void Deinitialize | ( | ) |
Deinitialize VFS layer.
Definition at line 40 of file file_io.cpp.
References Hdfs3Deinitialize(), and S3Deinitialize().
Referenced by thrill::api::Deinitialize(), and main().
| std::string FillFilePattern | ( | const std::string & | pathbase, |
| size_t | worker, | ||
| size_t | file_part | ||
| ) |
Takes pathbase and replaces $$$ with the worker number and ### with the file_part value.
Definition at line 71 of file file_io.cpp.
References debug, sLOG, and tlx::ssnprintf().
Referenced by WriteBinaryNode< ValueType >::OpenNextFile(), WriteLinesNode< ValueType >::PreOp(), and WriteLinesNode< ValueType >::WriteLinesNode().
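The placeholder substitution described for FillFilePattern can be illustrated with a minimal sketch. This is not the Thrill implementation: the names `ReplaceRun` and `FillFilePatternSketch` are hypothetical, and zero-padding each number to the width of its placeholder run is an assumption made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace the first run of `marker` characters in `s`
// with `value`, zero-padded to the length of that run (padding is an assumption).
static std::string ReplaceRun(std::string s, char marker, std::size_t value) {
    std::size_t begin = s.find(marker);
    if (begin == std::string::npos) return s;  // no placeholder: leave unchanged
    std::size_t end = s.find_first_not_of(marker, begin);
    if (end == std::string::npos) end = s.size();
    std::size_t width = end - begin;

    std::string digits = std::to_string(value);
    if (digits.size() < width) digits.insert(0, width - digits.size(), '0');
    s.replace(begin, width, digits);
    return s;
}

// Sketch of the substitution: '$' run -> worker, '#' run -> file_part.
std::string FillFilePatternSketch(const std::string& pathbase,
                                  std::size_t worker, std::size_t file_part) {
    return ReplaceRun(ReplaceRun(pathbase, '$', worker), '#', file_part);
}
```

Under these assumptions, `FillFilePatternSketch("out-$$$-###.txt", 7, 12)` yields `out-007-012.txt`, so each worker writes to its own sequence of output files.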
Reads a list of glob paths and delivers a file list, sizes, and prefix sums (in bytes) for all matching files.
Definition at line 128 of file file_io.cpp.
References FileList::contains_compressed, FileList::contains_remote_uri, Hdfs3Glob(), S3Glob(), tlx::starts_with(), SysGlob(), and FileList::total_size.
Referenced by Glob(), main(), ReadBinaryNode< ValueType >::ReadBinaryNode(), and ReadLinesNode::ReadLinesNode().
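To make the "prefix sums (in bytes)" part concrete, here is a small sketch of how per-file sizes could be turned into running byte offsets. The function name `ExclusivePrefixSums` is hypothetical, and whether FileList stores exclusive or inclusive sums is an assumption of this example, not something this page states.

```cpp
#include <cstddef>
#include <vector>

// For sizes {s0, s1, s2, ...} return {0, s0, s0+s1, ...}: the byte offset at
// which each file starts if all matching files were concatenated.
std::vector<std::size_t> ExclusivePrefixSums(const std::vector<std::size_t>& sizes) {
    std::vector<std::size_t> offsets;
    offsets.reserve(sizes.size());
    std::size_t running = 0;
    for (std::size_t s : sizes) {
        offsets.push_back(running);
        running += s;
    }
    return offsets;
}
```

Offsets of this kind let a reader map a global byte range onto individual files without opening them, which is presumably why the glob result carries them alongside the total size.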
Reads a single glob path and delivers a file list, sizes, and prefix sums (in bytes) for all matching files.
Definition at line 172 of file file_io.cpp.
References Glob().
| void Hdfs3Deinitialize | ( | ) |
Definition at line 295 of file hdfs3_file.cpp.
Referenced by Deinitialize().
| void Hdfs3Initialize | ( | ) |
Definition at line 292 of file hdfs3_file.cpp.
Referenced by Initialize().
| ReadStreamPtr Hdfs3OpenReadStream | ( | const std::string & | , |
| const common::Range & | |||
| ) |
| WriteStreamPtr Hdfs3OpenWriteStream | ( | const std::string & | ) |
| void Initialize | ( | ) |
Initialize VFS layer.
Definition at line 35 of file file_io.cpp.
References Hdfs3Initialize(), and S3Initialize().
Referenced by thrill::api::Initialize(), and main().
| bool IsCompressed | ( | const std::string & | path | ) |
Returns true if the file at path is compressed, judged by its name (e.g., it ends with '.gz', '.bz2', '.xz', or '.lzo').
Definition at line 47 of file file_io.cpp.
References tlx::ends_with().
Referenced by FileInfo::IsCompressed().
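A suffix-based check like the one described can be sketched as follows. The names `EndsWith` and `IsCompressedSketch` are hypothetical stand-ins (the real function references tlx::ends_with()), and the suffix list is taken from the description above.

```cpp
#include <string>

// Hypothetical stand-in for a suffix test on std::string.
static bool EndsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Sketch: decide by file extension only; the file contents are never inspected.
bool IsCompressedSketch(const std::string& path) {
    return EndsWith(path, ".gz") || EndsWith(path, ".bz2") ||
           EndsWith(path, ".xz") || EndsWith(path, ".lzo");
}
```

Because only the name is examined, a compressed file without one of these extensions would be treated as uncompressed in this sketch.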
| bool IsRemoteUri | ( | const std::string & | path | ) |
Returns true if path is a remote URI such as s3:// or hdfs://.
Definition at line 55 of file file_io.cpp.
References tlx::starts_with().
Referenced by FileInfo::IsRemoteUri().
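The scheme test can be sketched as a prefix comparison. `StartsWith` and `IsRemoteUriSketch` are hypothetical names (the real function references tlx::starts_with()), and only the two schemes named in the description are checked here.

```cpp
#include <string>

// Hypothetical stand-in for a prefix test on std::string.
static bool StartsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Sketch: a path counts as remote if it begins with a known remote scheme.
bool IsRemoteUriSketch(const std::string& path) {
    return StartsWith(path, "s3://") || StartsWith(path, "hdfs://");
}
```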
| ReadStreamPtr MakeBZip2ReadFilter | ( | const ReadStreamPtr & | ) |
| WriteStreamPtr MakeBZip2WriteFilter | ( | const WriteStreamPtr & | ) |
| ReadStreamPtr MakeGZipReadFilter | ( | const ReadStreamPtr & | ) |
| WriteStreamPtr MakeGZipWriteFilter | ( | const WriteStreamPtr & | ) |
| ReadStreamPtr OpenReadStream | ( | const std::string & | path, |
| const common::Range & | range = common::Range() |
||
| ) |
Construct reader for given path uri.
Range is the byte range [b,e) inside the file to read. If e = 0, the complete file is read.
For the POSIX SysFile implementation, the range is used only to seek to the byte offset b; bytes past e can still be read.
For the S3File implementation, however, the range [b,e) determines which data is fetched from S3. Hence, once e is reached, read() returns EOF.
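The half-open range convention, including the special case e = 0, can be illustrated with a small calculation of how many bytes a strictly bounded reader (such as the S3 case described above) would deliver. `ByteRange` and `BytesInRange` are hypothetical names for this sketch and are not part of common::Range; clamping e to the file size is an assumption.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for a half-open byte range [begin, end).
struct ByteRange {
    std::size_t begin;
    std::size_t end;  // 0 means "up to the end of the file"
};

// Number of bytes a reader that enforces the range would return.
std::size_t BytesInRange(const ByteRange& r, std::size_t file_size) {
    std::size_t e = (r.end == 0) ? file_size : std::min(r.end, file_size);
    return (r.begin < e) ? e - r.begin : 0;
}
```

A POSIX-style reader as described above would only use `begin` for seeking, so it could return more than this count if the caller keeps reading.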
Definition at line 180 of file file_io.cpp.
References Range::begin, die_unless, tlx::ends_with(), Hdfs3OpenReadStream(), MakeBZip2ReadFilter(), MakeGZipReadFilter(), S3OpenReadStream(), tlx::starts_with(), and SysOpenReadStream().
Referenced by ReadLinesNode::InputLineIteratorCompressed::HasNext(), ReadLinesNode::InputLineIteratorCompressed::InputLineIteratorCompressed(), ReadLinesNode::InputLineIteratorUncompressed::InputLineIteratorUncompressed(), main(), ReadLinesNode::InputLineIteratorUncompressed::Next(), ReadLinesNode::InputLineIteratorCompressed::Next(), and ReadBinaryNode< ValueType >::VfsFileBlockSource::VfsFileBlockSource().
| WriteStreamPtr OpenWriteStream | ( | const std::string & | path | ) |
Definition at line 211 of file file_io.cpp.
References tlx::ends_with(), Hdfs3OpenWriteStream(), MakeBZip2WriteFilter(), MakeGZipWriteFilter(), S3OpenWriteStream(), tlx::starts_with(), and SysOpenWriteStream().
Referenced by main(), WriteLinesNode< ValueType >::PreOp(), and WriteLinesNode< ValueType >::WriteLinesNode().
| std::ostream & operator<< | ( | std::ostream & | os, |
| const Type & | t | ||
| ) |
Definition at line 60 of file file_io.cpp.
| void S3Deinitialize | ( | ) |
Definition at line 734 of file s3_file.cpp.
Referenced by Deinitialize().
| void S3Initialize | ( | ) |
Definition at line 731 of file s3_file.cpp.
Referenced by Initialize().
| ReadStreamPtr S3OpenReadStream | ( | const std::string & | , |
| const common::Range & | |||
| ) |
| WriteStreamPtr S3OpenWriteStream | ( | const std::string & | ) |
Glob a path and augment the FileList with matching file names.
Definition at line 144 of file sys_file.cpp.
References All, CSimpleGlob, debug, die, Directory, File, LOG1, FileInfo::path, FileInfo::size, sLOG, SysGlobWalkRecursive(), thrill::mem::to_string(), and FileInfo::type.
Referenced by Glob().
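| static void SysGlobWalkRecursive | ( | const std::string & | path, |
| FileList & | filelist | ||
| ) |
static |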
Definition at line 55 of file sys_file.cpp.
References Directory, File, FileInfo::path, FileInfo::size, thrill::mem::to_string(), thrill::common::ts_readdir(), and FileInfo::type.
Referenced by SysGlob().
| ReadStreamPtr SysOpenReadStream | ( | const std::string & | path, |
| const common::Range & | range = common::Range() |
||
| ) |
Open file for reading and return file descriptor.
Handles compressed files by calling a decompressor in a pipe, like "cat $f | gzip -dc |" in bash.
| path | Path to open |
| range | Byte range to read. The begin of the range is used as the seek offset; end can be 0 to read the whole file. Depending on the underlying file system, reads past end may succeed without error, since the end is not enforced. |
POSIX lseek function from current position.
Definition at line 323 of file sys_file.cpp.
References Range::begin, debug, tlx::ends_with(), LOG1, thrill::common::MakePipe(), O_BINARY, thrill::common::PortSetCloseOnExec(), and sLOG.
Referenced by OpenReadStream().
| WriteStreamPtr SysOpenWriteStream | ( | const std::string & | path | ) |
Open file for writing and return file descriptor.
Handles compressed files by calling a compressor in a pipe, like "| gzip > $f" in bash.
| path | Path to open |
Definition at line 414 of file sys_file.cpp.
References debug, tlx::ends_with(), LOG1, thrill::common::MakePipe(), O_BINARY, thrill::common::PortSetCloseOnExec(), and sLOG.
Referenced by OpenWriteStream().