| Module | RIO::IF::Grande | lib/rio/if/grande.rb |
Grande Copy-From Operator
The copy-from grande-operator copies a Rio from another Rio or another ruby object. Its operation depends on the file system objects referenced, the rio options set, and the state of its source and destination. In the broadest of terms it could be described as doing the following:
source.each do |entry|
destination << entry
end
That is to say, it iterates through its argument, calling the append-from operator (<<) for each element. While it is not implemented like this, and the above code would not give the same results, this generalized description is convenient.
For example the code:
dst < src
# is like
src.each { |element| dst << element }
for any of the following definitions of src and dst:
src = rio('afile')
dst = rio('acopy')
src = rio('afile').lines(0..9)
dst = rio('acopy')
src = rio('srcdir')
dst = rio('dstdir')
src = rio('srcdir').dirs(/^\./).files('*.tmp')
dst = rio('dstdir')
src = ["line0\n","line1\n"]
dst = rio('afile')
src = [rio("file1"),rio("file2")]
dst = rio('adir')
Arrays are handled differently depending on whether the rio references a file or a directory.
dest = rio('afile')
dest < array
# is roughly equivalent to
array.each do |el|
  case el
  when ::String then dest.print(el)
  when ::Rio then dest << el
  else dest << rio(el)
  end
end
dest = rio('adir')
dest < array
# is roughly equivalent to
array.each do |el|
  case el
  when ::String then rio(el)
  when ::Rio then dest << el
  else dest << rio(el)
  end
end
To improve run-time efficiency, Rio will choose from among several strategies when copying. For instance when no file or directory filtering is specified, FileUtils#cp_r is used to copy directories; and when no line filtering is specified, FileUtils#cp is used to copy files.
rio('adir') < rio('anotherdir') # 'anotherdir' is copied to 'adir' using FileUtils#cp_r
rio('adir') < rio('anotherdir').files('*.rb') # copy only .rb files
rio('afile') < rio('anotherfile') # 'anotherfile' is copied to 'afile' using FileUtils#cp
rio('afile') < ios # ios must be an IO object opened for reading
rio('afile') < astring # basically the same as rio('afile').print(astring)
anarray = [ astring, rio('anotherfile') ]
rio('afile') < anarray # copies each element to 'afile' as if one had written
ario = rio('afile')
anarray.each do |el|
ario << el
end
ario.close
rio('skeldir') < rio('adir').dirs # copy only the directory structure
rio('destdir') < rio('adir').dirs.files(/^\./) # copy the directory structure and all dot files
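The strategy choice described above can be illustrated with the standard library alone. This is a minimal sketch, not Rio's implementation: copy_file_sketch is a hypothetical helper that takes the fast FileUtils.cp path when no line filter is given and falls back to line-by-line iteration when a range of line numbers is requested.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not part of Rio): copy src to dst, choosing between
# a whole-file copy and filtered, line-by-line iteration.
def copy_file_sketch(src, dst, line_range = nil)
  if line_range.nil?
    FileUtils.cp(src, dst) # no filtering requested: plain file copy
    :fast_path
  else
    File.open(dst, 'w') do |out|
      File.foreach(src).with_index do |line, recno|
        out.print(line) if line_range.cover?(recno) # zero-based line numbers
      end
    end
    :iterated
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, 'afile')
  File.write(src, "line0\nline1\nline2\n")

  copy_file_sketch(src, File.join(dir, 'acopy'))         # whole file
  copy_file_sketch(src, File.join(dir, 'partial'), 0..1) # first two lines only

  puts File.read(File.join(dir, 'partial'))
end
```

The return values :fast_path and :iterated exist only to make the chosen branch visible; Rio itself exposes no such indicator.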
Grande Append-From Operator
The append-from grande-operator copies a Rio from another Rio or another ruby object. This behaves like < (copy-from) except unopened Rios are opened for append.
The following summarizes how objects are copied:
| IO: | IO#each is used to iterate through the source with each record appended to the Rio |
| Array: | Each element of the Array is appended individually to the Rio. |
| String: | The string is appended to the Rio using print |
| Rio: | The source Rio is appended using its >> (append-to) operator |
See < (copy-from)
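The dispatch in the table above can be modelled in plain Ruby. The sketch below is an assumption-laden illustration rather than Rio's code: append_from_sketch is a hypothetical function, a StringIO stands in for a Rio opened for append, and the Rio-to-Rio case is left out because it needs the library itself.

```ruby
require 'stringio'

# Hypothetical model of the table above; dest is any object responding
# to #print (a StringIO stands in for an opened Rio).
def append_from_sketch(dest, source)
  case source
  when String
    dest.print(source)                                 # String: printed as-is
  when Array
    source.each { |el| append_from_sketch(dest, el) }  # Array: element by element
  when IO, StringIO
    source.each { |record| dest.print(record) }        # IO: one record at a time
  else
    raise ArgumentError, "unsupported source: #{source.class}"
  end
  dest
end

out = StringIO.new
append_from_sketch(out, "line0\n")
append_from_sketch(out, ["line1\n", StringIO.new("line2\nline3\n")])
puts out.string
```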
Grande Copy-To Operator
The copy-to grande-operator copies a Rio to another Rio or another ruby object. The behaviour and the library used depend on the types of the source and destination. For simple file or directory copying ::FileUtils#cp or ::FileUtils#cp_r will be used. If any of the Rio grande methods are specified for the source or destination, the source Rio will be iterated through, copying records to the destination as specified. Roughly equivalent to
dst = rio('dst_file')
rio('src_file').each do |line|
dst.print(line)
end
dst.close
The destination of the copy operators may be a:
| IO: | Each record of the Rio is written to the IO using IO#print. The IO must be opened for writing. |
| Array: | Each record or entry of the Rio becomes an element of the array |
| String: | Puts the entire contents of the Rio into the string |
| Rio: | Depends on the destination. See below. |
Copy a file to a file
rio('src_file') > rio('dst_file')
Copy a file to a directory
rio('src_file') > rio('dst_dir')
Copy a directory to another directory
rio('src_dir') > rio('dst_dir')
Make an ungzipped copy of a gzipped file
rio('src.txt.gz').gzip > rio('dst.txt')
Copying to an array
rio('afile') > ary # each line of the file becomes an element of the ary
rio('afile').chomp > ary # same thing with lines chomped
rio('afile.gz').gzip.chomp > ary # same thing from a gzipped file
rio('afile').lines(0..9) > ary # ary will contain only the first ten lines of the file
rio('afile').chomp.lines(0..9) > ary # same thing with lines chomped
rio('afile').gzip.chomp.lines(0..9) > ary # same thing from a gzipped file
rio('afile').skiplines(0..9) > ary # ary will contain all but the first ten lines of the file
rio('adir') > ary # ary will contain a Rio for each entry in the directory
rio('adir').files > ary # same, but only files
rio('adir').files('*.rb') > ary # same, but only .rb files
Copying to a string
rio('afile') > astring # slurp the entire contents of the file into astring
astring = rio('afile').contents # same effect
Copy the first line and every line containing the word Rio into a gzipped file
rio('src').lines(0,/Rio/) > rio('dst.gz').gzip
Copy lines of a web page into an array with each line chomped
rio('http://ruby-doc.org/index.html').chomp > an_array
Copy the first and 8th through 10th columns of the first ten rows of a gzipped csv file on a web site into a local gzipped csv file that uses semi-colons as separators
rio('http://host/file.csv.gz').columns(0,7..9).gzip.csv[0..9] > rio('localfile.csv.gz').csv(';').gzip
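For comparison, the "ungzipped copy" example above can be written with only the standard zlib library. This is a sketch of the record-by-record copy the section describes, not what Rio does internally; gunzip_copy_sketch and the file names are hypothetical.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical helper (not Rio's implementation): write an uncompressed
# copy of a gzipped text file, one line at a time.
def gunzip_copy_sketch(src_gz, dst)
  Zlib::GzipReader.open(src_gz) do |gz|
    File.open(dst, 'w') do |out|
      gz.each_line { |line| out.print(line) }
    end
  end
  dst
end

Dir.mktmpdir do |dir|
  src = File.join(dir, 'src.txt.gz')
  dst = File.join(dir, 'dst.txt')
  Zlib::GzipWriter.open(src) { |gz| gz.write("first\nsecond\n") }

  gunzip_copy_sketch(src, dst)
  puts File.read(dst)
end
```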
Grande Append-To Operator
The append-to grande-operator is the same as > (copy-to) except that it opens the destination for append. The destination can be a kind of:
| IO: | Each record of the Rio is written to the IO using IO#print. The IO must be opened for writing. |
| Array: | Each record or entry of the Rio is appended to the destination array |
| String: | Appends the entire contents of the Rio to destination |
| Rio: | Just like > (copy-to) except that unopened objects are opened for append. If the destination is already opened for writing or is a directory, this is identical to > (copy-to) |
See > (copy-to)
rio('afile') >> rio('anotherfile') # append the contents of 'afile' to 'anotherfile'
rio('afile') >> rio('adir') # copies 'afile' to the directory 'adir'
rio('adir') >> rio('anotherdir') # copy directory 'adir' recursively to 'anotherdir'
rio('adir') >> ary # a Rio for each entry in the directory will be appended to ary
Grande Subscript Operator
For files this returns all or part of a file as an array.
For directories this returns all or some of the entries in a directory
This combines the record selection offered by records with the conversion to an array provided by to_a. The following two are equivalent:
ario[*args]
ario.records(*args).to_a
What constitutes an array element is determined by lines, bytes, or by an extension such as csv. lines is the default.
Arguments may consist of zero or more integers, ranges, regular expressions, symbols, procs, and arrays. An empty argument list selects all records.
Records are selected as follows.
| range: | specifies a range of records to be selected (zero based) |
| regexp: | matching records will be selected. |
| integer: | treated like a one element range |
| symbol: | the symbol is sent to each record. Record is selected unless false is returned |
| proc: | the proc is called with the record as an argument. Record is selected unless false is returned |
| array: | the array may contain any of the other selector types. Record is selected unless any of the selectors returns false. (a logical and) |
A record matching any of the selectors will be included in the array. (acts like an or)
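The selection rules above (an "or" across the arguments, an "and" inside an array) can be modelled in a few lines of plain Ruby. This is a hypothetical sketch rather than Rio's implementation; in particular it treats nil like false for symbol and proc selectors, which is an assumption.

```ruby
# Hypothetical model of the selector rules above, not Rio's implementation.
# Returns true when a single selector accepts the record at index recno.
def selector_match_sketch?(selector, record, recno)
  case selector
  when Range   then selector.cover?(recno)
  when Integer then selector == recno
  when Regexp  then !selector.match(record).nil?
  when Symbol  then record.__send__(selector) ? true : false   # nil counts as false here
  when Proc    then selector.call(record) ? true : false       # nil counts as false here
  when Array   then selector.all? { |s| selector_match_sketch?(s, record, recno) }
  else raise ArgumentError, "unsupported selector: #{selector.inspect}"
  end
end

# A record is kept when any top-level selector accepts it.
def select_records_sketch(records, *selectors)
  return records.to_a if selectors.empty?
  records.each_with_index.select do |record, recno|
    selectors.any? { |s| selector_match_sketch?(s, record, recno) }
  end.map(&:first)
end

lines = ["zero\n", "Zippy one\n", "two\n", "\n", "Zippy four\n"]
p select_records_sketch(lines, 0, /Zippy/)       # record 0 or any 'Zippy' record
p select_records_sketch(lines, [1..4, /Zippy/])  # in 1..4 and containing 'Zippy'
```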
Because this is implemented in terms of each, when only record ranges are used to select records, iteration will stop once the recno exceeds the largest maximum of the ranges. That is to say:
This reads one record from a file and returns it
rio('bigfile.mp3').bytes(1024)[0]
While this reads all records from a file and returns the first one
rio('bigfile.mp3').bytes(1024).to_a[0]
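The early-stop behaviour can be sketched with an Enumerator that counts how many records it produces. take_ranges_sketch is a hypothetical helper that assumes at least one integer or range selector; it is meant only to show why the subscript form reads less than to_a[0].

```ruby
# Hypothetical sketch: when every selector is a range or an integer,
# iteration can stop once the record number reaches the largest one.
def take_ranges_sketch(records, *ranges)
  last = ranges.map { |r| r.is_a?(Range) ? r.max : r }.max
  selected = []
  records.each_with_index do |record, recno|
    selected << record if ranges.any? { |r| r === recno }
    break if recno >= last # nothing beyond this point can be selected
  end
  selected
end

produced = 0
chunks = Enumerator.new do |y|
  1000.times do |i|
    produced += 1
    y << "chunk#{i}"
  end
end

p take_ranges_sketch(chunks, 0) # only the first record is generated
p produced

produced = 0
p chunks.to_a[0]                # every record is generated first
p produced
```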
This combines the entry selection offered by entries with the conversion to an array provided by to_a. The following two are equivalent:
ario[*args]
ario.entries(*args).to_a
Arguments may consist of strings (treated as globs) or regular expressions. An empty argument list selects all entries. See ::Dir#glob and ::File::fnmatch? for more information on globs. Be warned that using the '**' glob recurses into directories independently of all, and using both is unsupported.
ario = rio('adir')
ario[] # returns an array containing all entries in _adir_
ario[/^zippy/] # all entries starting with 'zippy'
ario['zippy*'] # same thing
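The glob-versus-regexp distinction can be tried without a directory on disk. The following is a minimal sketch using File.fnmatch? from the standard library; select_entries_sketch and the entry names are hypothetical, and Rio's own matching may differ in details such as dot-file handling.

```ruby
# Hypothetical sketch, not Rio's implementation: filter entry names with
# strings (treated as globs) or regular expressions.
def select_entries_sketch(names, *selectors)
  return names.dup if selectors.empty? # empty argument list selects everything
  names.select do |name|
    selectors.any? do |sel|
      sel.is_a?(Regexp) ? sel.match?(name) : File.fnmatch?(sel, name)
    end
  end
end

names = %w[zippy.txt zippy2.rb pinhead.txt .hidden]
p select_entries_sketch(names)            # all entries
p select_entries_sketch(names, /^zippy/)  # entries starting with 'zippy'
p select_entries_sketch(names, 'zippy*')  # same selection, written as a glob
p select_entries_sketch(names, '*.txt')   # entries ending in .txt
```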
As with each:
rio('adir').files['*.txt'] # array of all .txt files
rio('adir').dirs[/^\./] # array of all dot directories
rio('adir').all.files['*.[ch]'] # array of c source files in adir and its subdirectories
rio('adir').all.dirs[/^\.svn/] # array of subversion directories in adir and subdirectories
rio('adir').files('*.rb').dirs['ruby*'] # array of .rb files and
# directories starting with 'ruby'
rio('adir').dirs('ruby*').files['*.rb'] # same thing
This section applies similarly to lines, bytes, records, and rows
Using lines and related methods with a Rio referencing a directory implies files and will cause an array of the lines or bytes in the files to be returned. As above, the arguments to the subscript operator will be applied to the closest selection method.
rio('adir').lines[] # array of all lines in the files in 'adir'
rio('adir').files.lines[] # same thing
rio('adir').lines(0..9).files['*.txt'] # array of the first ten lines of all .txt files
rio('adir').files('*.txt').lines[0..9] # same thing
rio('adir').all.files('*.rb').lines[/^\s*require/] # array of 'require' lines in .rb files in
# 'adir' and its subdirectories
Note the difference between the following similar usages
it1 = rio('adir').files('*.rb') # returns a Rio, prepared for selecting ruby files
it2 = rio('adir').files['*.rb'] # returns an array of the ruby files
The second example above could have been written
it2 = it1.to_a
Examples:
rio('afile.txt').lines[1..2] # array containing the 2nd and 3rd line
rio('afile.txt')[1,3..5] # array containing lines 1,3,4 and 5
rio('afile.txt')[/Zippy/] # array of all lines containing 'Zippy'
rio('afile.txt')[1,3..5,/Zippy/] # array with lines 1,3,4 and 5 and all lines containing 'Zippy'
rio('afile.dat').bytes(50)[] # array containing the contents of afile.dat broken up into 50 byte chunks
rio('afile.dat').bytes(50)[0,2] # array containing the first and third such chunk
rio('afile.dat').bytes(50).records[0,2] # same thing
rio('afile.dat').bytes(50).records(0,2).to_a # once again
rio('afile.csv').csv[0..9] # array of the first 10 records of afile.csv parsed by the ::CSV module
rio('afile.csv').csv.records[0..9] # same thing
rio('afile.csv').csv(';').records[0..9] # same thing using semi-colon as the value separator
rio('afile.csv').csv.records[0,/Zippy/] # record 0 and all records containing 'Zippy'
# the regexp is matched against the line before parsing by ::CSV
rio('adir')[] # array of entries in 'adir'
rio('adir')['*.txt'] # array of all .txt entries
rio('adir').all['*.txt'] # array of all .txt entries in 'adir' and its subdirectories
rio('adir').files['*.txt'] # array of all .txt files
rio('adir').dirs['CSV'] # array of all CSV directories
rio('adir').skipdirs['CSV'] # array of all non-CSV directories
For a file Rio, delete calls FileUtils#rm. For a directory Rio, delete calls FileUtils#rmdir. Returns the Rio. If the Rio does not exist, it simply returns the Rio.
rio('afile.txt').delete # delete 'afile.txt'
rio('adir').delete # delete adir
rio('something').delete # delete something
For a file, delete! calls FileUtils#rm. For a directory, delete! calls FileUtils#rmtree. Returns the Rio. If the Rio does not exist, it simply returns itself.
rio('afile.txt').delete! # delete 'afile.txt'
rio('adir').delete! # delete adir
# create a directory, after deleting anything that previously had its name
rio('adir/asubdir').delete!.mkpath
In all cases, deleting something that does not exist is considered successful.
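The difference between the two delete methods can be sketched directly with FileUtils. The helpers below are hypothetical stand-ins that operate on path strings instead of Rios; they assume the behaviour described above and are not Rio's code.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical counterpart of delete: rm for files, rmdir for directories.
def delete_sketch(path)
  return path unless File.exist?(path) # deleting nothing counts as success
  File.directory?(path) ? FileUtils.rmdir(path) : FileUtils.rm(path)
  path
end

# Hypothetical counterpart of delete!: directories are removed recursively.
def delete_bang_sketch(path)
  return path unless File.exist?(path)
  File.directory?(path) ? FileUtils.rmtree(path) : FileUtils.rm(path)
  path
end

Dir.mktmpdir do |dir|
  file = File.join(dir, 'afile.txt')
  tree = File.join(dir, 'adir')
  File.write(file, 'contents')
  FileUtils.mkpath(File.join(tree, 'asubdir'))

  delete_sketch(file)
  delete_sketch(file)       # second call is a no-op, not an error
  delete_bang_sketch(tree)  # removes 'adir' and everything below it

  p File.exist?(file)
  p File.exist?(tree)
end
```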
Iterate through a rio. Executes the block for each item selected for the Rio. See lines, records, bytes, files, dirs, [] and to_a for more information on how records are selected and what kind of record is passed to the block.
each is the fundamental method for all the Rio grande operators. to_a and the Rio copy operators <, <<, >>, and > are all implemented in terms of each.
While each is fundamental to a Rio, it rarely needs to be called explicitly, because all the grande configuration methods will also take a block and call each if one is given. So the existence of a block after many methods is taken as an implied each.
For Rios that refer to files, the item passed to the block is a String containing the line or block as selected by lines, or bytes. lines is the default.
rio('afile').lines.each { |line| ...}
The block passed to each will also accept an optional second parameter which will contain the result of the matching function. What this variable contains depends on the argument to lines that resulted in the match as follows:
| Regexp: | The MatchData that resulted from the match. |
| Range: | The record number of the matching record. |
| Fixnum: | The record number of the matching record. |
| Proc: | The value returned by the proc. |
| Symbol: | The value resulting from sending the symbol to the String. |
If no selection arguments were used, this variable will simply contain true.
rio(??).puts(%w[0:zero 1:one]).rewind.lines(/(\d+):([a-z]+)/) do |line,match|
puts("#{match[1]} is spelled '#{match[2]}'")
end
Produces:
0 is spelled 'zero'
1 is spelled 'one'
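The optional second block parameter can be imitated on a plain array of lines. The sketch below is hypothetical and only mirrors the table above: each_selected_sketch accepts a single selector and yields the line together with whatever the selector produced.

```ruby
# Hypothetical sketch of how a selector's result could be handed to the
# block as a second argument; not Rio's implementation.
def each_selected_sketch(lines, selector = nil)
  lines.each_with_index do |line, recno|
    result =
      case selector
      when nil     then true                                  # no selection arguments
      when Regexp  then selector.match(line)                  # MatchData or nil
      when Range   then selector.cover?(recno) ? recno : nil  # record number
      when Integer then selector == recno ? recno : nil       # record number
      when Proc    then selector.call(line)                   # the proc's value
      when Symbol  then line.__send__(selector)               # the method's value
      end
    yield line, result if result
  end
  lines
end

each_selected_sketch(%w[0:zero 1:one skipped], /(\d+):([a-z]+)/) do |_line, match|
  puts "#{match[1]} is spelled '#{match[2]}'"
end
```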
For Rios that refer to directories, the item passed to the block is a Rio referring to the directory entry.
rio('adir').files.each do |file|
file.kind_of?(RIO::Rio) # true
end
In addition, the Rio passed to the block inherits certain attributes from the directory Rio.
rio('adir').files.chomp.each do |file| # chomp is ignored for directories,
file.each do |line| # chomp attribute is inherited by the file rio
# .. line is chomped
end
end
each returns the Rio which called it.
Here are a few illustrative examples
rio('f.txt').each { |line| ... } # execute block for every line in the file
rio('f.txt').lines.each { |line| ... } # same thing
rio('f.txt').lines { |line| ... } # same thing
rio('f.txt').chomp.each { |line| ... } # same as above with lines chomped
rio('f.txt').chomp { |line| ... } # ditto
rio('f.txt').lines.chomp { |line| ... } # ditto
rio('f.txt').chomp.lines { |line| ... } # ditto
rio('f.txt.gz').gzip.each { |line| ... } # execute block for every line in a gzipped file
rio('f.txt.gz').gzip { |line| ... } # same thing
rio('f.txt.gz').lines.gzip { |line| ... } # same thing
rio('f.txt.gz').gzip.chomp { |line| ... } # chomp lines from a gzipped file
rio('f.txt.gz').gzip.chomp.each { |line| ... } # ditto
rio('f.txt.gz').chomp.lines.gzip { |line| ... } # ditto
rio('f.txt').lines(0..9) { |line| ... } # execute block for the first 10 lines in the file
rio('f.txt').lines(0..9).each { |line| ... } # same thing
rio('f.txt').lines(/^\s*#/) { |line| ... } # execute block for comment-only lines
rio('f.txt').lines(/^\s*#/).each { |line| ... } # same thing
rio('f.txt').lines(0,/Rio/) { |line| ... } # execute block for the first line and
# all lines containing 'Rio'
rio('f.txt.gz').gzip.chomp.lines(0..1) { |line| ... } # first 2 lines chomped from a gzip file
rio('f.dat').bytes(10).each { |data| ... } # process the file 10 bytes at a time
rio('f.dat').bytes(10) { |data| ... } # same thing
rio('f.dat').bytes(10).records(2,4) { |data| ... } # only 3rd and 5th ten-byte data-block
rio('f.dat.gz').gzip.records(2,4).bytes(10) { |data| ... } # same from a gzipped file
rio('adir').each { |ent| ... } # execute the block for each entry in the directory 'adir'
rio('adir').files.each { |file| ...} # only files
rio('adir').files { |file| ...} # ditto
rio('adir').all.files { |file| ...} # files, recurse into subdirectories
rio('adir').dirs { |dir| ...} # only directories
rio('adir').files('*.rb') { |file| ...} # only .rb files using a glob
rio('adir').files(/\.rb$/) { |file| ...} # only .rb files using a regular expression
rio('adir').all.files('*.rb') { |file| ...} # .rb files, recursing into subdirectories
rio('adir').dirs(/^\./) { |dir| ... } # only dot directories
rio('adir').dirs('/home/*') { |dir| ... } # home directories
See RIO::Doc::HOWTO and RIO::Doc::SYNOPSIS for more examples, and RIO::Doc::INTRO for further explanation.
Returns true if the referenced file or directory is empty after honoring the grande selection methods.
rio('f0').delete!.touch.empty? #=> true
rio('f1').puts!("Not Empty\n").empty? #=> false
rio('d0').delete!.mkdir.empty? #=> true
Reads and returns the next record or entry from a Rio, honoring the grande selection methods.
Returns nil on end of file.
See also records, lines, each, []
ario = rio('afile').lines(10..12)
line10 = ario.get
line11 = ario.get
line12 = ario.get
a_nil = ario.get
ario = rio('adir').entries('*.txt')
ent1 = ario.get
ent2 = ario.get
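The "next record, or nil at the end" contract of get resembles pulling from an external Enumerator. The class below is a hypothetical illustration of that contract for a range of line numbers; it is not Rio's class and works on an in-memory array.

```ruby
# Hypothetical reader (not Rio's class): hands out selected records one
# at a time and returns nil once they are exhausted.
class RecordReaderSketch
  def initialize(records, range)
    @enum = records.each_with_index
                   .select { |_rec, recno| range.cover?(recno) }
                   .map(&:first)
                   .each
  end

  def get
    @enum.next
  rescue StopIteration
    nil # end of the selected records
  end
end

lines = (0..20).map { |i| "line#{i}\n" }
reader = RecordReaderSketch.new(lines, 10..12)
p reader.get
p reader.get
p reader.get
p reader.get
```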
Grande Exclude method
skip can be used in two ways.
If called with no arguments it reverses the polarity of the next non-skip grande selection method that is called. That is, it turns lines, records, rows, files, dirs and entries into skiplines, skiprecords, skiprows, skipfiles, skipdirs, and skipentries, respectively.
rio('afile').skip.lines(0..5) # same as rio('afile').skiplines(0..5)
rio('adir').skip.files('*~') # same as rio('adir').skipfiles('*~')
Note that it only affects the next selection method seen — and may be used more than once. If no grande selection method is seen, skip is ignored.
When called with arguments it acts like skipentries for directory Rios and like skiprecords for stream Rios.
rio('afile').lines(/Rio/).skip[0..4] # lines containing 'Rio' excluding the
# first five lines
rio('adir').files('*.rb').skip[:symlink?] # .rb files, but not symlinks to
# .rb files
If a block is given, behaves as if skip(*args).each(&block) had been called.
Returns the Rio.
See skiplines, skiprecords, skiprows, skipfiles, skipdirs, and skipentries.
split has two distinct behaviors depending on whether or not it is called with an argument.
Called without an argument, split returns an array of Rios, one for each path element. (Note that this behavior differs from File#split.)
rio('a/b/c').split #=> [rio('a'),rio('b'),rio('c')]
The array returned is extended with a to_rio method, which will put the parts back together again.
ary = rio('a/b/c').split #=> [rio('a'),rio('b'),rio('c')]
ary.to_rio #=> rio('a/b/c')
ary = rio('a/b/c').split #=> [rio('a'),rio('b'),rio('c')]
ary[1] = rio('d')
ary.to_rio #=> rio('a/d/c')
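The contrast with File.split can be seen using plain strings and the standard pathname library. The helpers below are hypothetical and return strings rather than Rios; they only sketch the split-and-rejoin round trip.

```ruby
require 'pathname'

# Hypothetical helpers working on strings instead of Rios.
def split_path_sketch(path)
  Pathname.new(path).each_filename.to_a # one string per path element
end

def join_path_sketch(parts)
  File.join(*parts)
end

parts = split_path_sketch('a/b/c')
p parts
p File.split('a/b/c') # for contrast: directory part and base name only

parts[1] = 'd'
p join_path_sketch(parts)
```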
Called with an argument, split causes String#split(arg) to be called on every line before it is returned. An array of the split lines is returned when iterating.
rio('/etc/passwd').split(':').columns(0,2) { |ary|
username,uid = ary
}
rio('/etc/passwd').split(':').columns(0,2).to_a #=> [[user1,uid1],[user2,uid2]]
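The same column extraction can be written with String#split and Array#values_at. This is a sketch on sample data, not Rio's implementation; split_columns_sketch is hypothetical, and chomping each line first is a choice made here, not something the text above specifies.

```ruby
# Hypothetical helper: split each line on sep and keep the given columns.
def split_columns_sketch(lines, sep, *columns)
  lines.map { |line| line.chomp.split(sep).values_at(*columns) }
end

# Sample data in /etc/passwd format (made up for the example).
passwd = [
  "root:x:0:0:root:/root:/bin/sh\n",
  "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
]

split_columns_sketch(passwd, ':', 0, 2).each do |username, uid|
  puts "#{username} has uid #{uid}"
end
```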
Returns the contents of the rio as an array. (See ::Enumerable#to_a)
to_a is implemented in terms of each, so the following are roughly equivalent:
ary = ario.to_a
ary = []
ario.each do |rec|
  ary << rec
end
What constitutes an array element is determined by lines, bytes, records, rows or by an extension such as csv. lines is the default.
rio('afile.txt').to_a # returns an array of the lines in afile.txt
rio('afile.txt').lines(1...3).to_a # an array containing lines 1 and 2 of afile.txt
rio('afile.dat').bytes(50).to_a # an array containing the contents of afile.dat broken
# up into 50 byte chunks
See also [] (subscript operator)
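The relationship between each and to_a is the usual Enumerable arrangement in Ruby. The class below is a hypothetical stand-in for a line-oriented source, shown only to illustrate that defining each is enough to obtain to_a; it says nothing about how Rio is structured internally.

```ruby
# Hypothetical line source: Enumerable builds to_a (and select, map, ...)
# on top of the single #each method defined here.
class LineSourceSketch
  include Enumerable

  def initialize(text)
    @text = text
  end

  def each
    return enum_for(:each) unless block_given?
    @text.each_line { |line| yield line }
    self # each returns the object that called it
  end
end

src = LineSourceSketch.new("first\nsecond\n")
p src.to_a

ary = []
src.each { |rec| ary << rec } # the hand-written equivalent
p ary
```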
Grande Pipe Operator
The Rio pipe operator is actually an alternative syntax for calling the > (copy-to) operator, designed to allow several copy operations to be performed in one line of code, with behavior that mimics the pipe operator commonly available in shells.
If destination is a cmdio, a cmdpipe Rio is returned, and none of the commands are run.
Otherwise the cmdpipe Rio is run with the output of the pipe being copied to the destination. In this case a Rio representing the destination is returned.
If destination is not a Rio it is passed to the Rio constructor as is done with the copy-to operator except that if destination is a String it is assumed to be a command instead of a path.
rio('afile') | rio(?-,'grep i') | rio(?-) # returns rio(?-)
# equivalent to rio(?-, 'grep i') < rio('afile') > rio(?-)
rio('infile') | rio(?-, 'acmd') | rio(?-, 'acmd2') | rio('outfile')
# same as
# acmd = rio(?-,'acmd')
# acmd2 = rio(?-,'acmd2')
# out = rio('outfile')
# acmd < rio('infile')
# acmd2 < acmd
# out < acmd2
rio('afile') | 'acmd' | 'acmd2' | rio('outfile') # same thing
acmdpipe = rio(?-,'acmd') | 'acmd2'
rio('afile') | acmdpipe | rio('outfile') # same thing
acmdpipe1 = rio(?|,'acmd','acmd2')
rio('afile') | acmdpipe1 | rio('outfile') # same thing
acmdpipe2 = rio('afile') | 'acmd' | 'acmd2'
acmdpipe2 | rio('outfile') # same thing
The grande pipe operator cannot be used to create a cmdpipe Rio that includes a destination. This must be done using a Rio constructor:
cmd_with_output = rio(?|,'acmd',rio('outfile'))
rio('afile') | cmd_with_output # same as above
Copyright © 2005,2006,2007 Christopher Kleckner. All rights reserved.