Until now you've mostly been working with self-contained command-line programs. It's time to start branching out and interacting with files.
Working with files and data streams is a large topic in its own right and one which gets deep into the file system and operating system levels. For our purposes, we'll stick to the bits you're most likely to use and leave any further exploration up to you.
In this lesson, we'll cover the very basics of what files are and how these streams of bytes are handled. The next lesson will expose how you can apply these concepts in Ruby.
A file is an ordered and named collection of bytes that has persistent storage. -- MSDN
Files are basically just collections of bits and bytes that you'll somehow need to open, read into your program, modify, and save.
Even though many files (like images) look like a giant jumble of data when you open them up in a text editor, it can be helpful to think of all files as one really long string, or stream, of bytes. Your script will read them in from top to bottom, performing whatever operations you specify along the way.
Behind the scenes, a file is just a long sequence of bytes kept in persistent storage; your program pulls those bytes into memory as it reads them.
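To make the "long stream of bytes" idea concrete, here is a minimal sketch that writes a tiny file and reads it back as raw byte values. The temporary file, its "demo" name prefix, and its contents are made up purely for illustration.

```ruby
require "tempfile"

# Create a throwaway file containing three characters.
# (The "demo" prefix and the contents are hypothetical.)
temp = Tempfile.new("demo")
temp.write("Hi!")
temp.close

# Read the file back in binary mode and look at each byte in order.
bytes = File.binread(temp.path).bytes
puts bytes.inspect   # => [72, 105, 33]

temp.unlink          # clean up the temporary file
```

Each number is one link in the chain: 72 is "H", 105 is "i", and 33 is "!".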
We'll be using 3 similar terms here so it's worth distinguishing between them:
Streams are the generic way of describing raw chunks of bytes. Think of them as lengths of chain, where each link is a byte.
Stream is also the name some languages give to the class used to work with streams of bytes (Ruby's closest equivalent is the IO class), though we won't encounter it much directly.
Files are just streams that have been neatly contained somewhere on your hard drive by defining some metadata like their location and access privileges. Think of them like chains that have been attached to a roll and stored in a warehouse. The "File" object you work with is really just a reference to the beginning of this chain in memory plus its metadata.
String is a formal data type that a programming language can handle. Streams of character bytes are converted into strings so you can actually work with them in your scripts.
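The three terms above can be seen side by side in a short sketch. It assumes a throwaway temporary file (the "chain" prefix and the contents are invented for the example) and uses Ruby's File object to show the metadata, the position of the reference in the stream, and the String you get back when you read.

```ruby
require "tempfile"

# Hypothetical sample file so the sketch is self-contained.
temp = Tempfile.new("chain")
temp.write("Ruby")
temp.close

# The File object: a reference into the stream plus some metadata.
file  = File.open(temp.path, "rb")
size  = file.size      # metadata: total bytes on disk
start = file.pos       # the reference starts at the first link of the chain
chunk = file.read(2)   # pull two bytes off the stream; Ruby hands back a String
after = file.pos       # the reference has moved two bytes along
file.close
temp.unlink

puts size                 # => 4
puts start                # => 0
puts chunk.class          # => String
puts chunk.bytes.inspect  # => [82, 117]
puts after                # => 2

# Going the other way: raw byte values converted into a String.
text = [82, 117, 98, 121].pack("C*").force_encoding("UTF-8")
puts text                 # => Ruby
```

Reading moves the reference forward, which is why the position changes from 0 to 2 after two bytes are read.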
When programs interact with each other and your operating system, they open and close streams of bytes like files. There are two particularly important and useful streams that you interact with all the time -- STDIN ("Standard In") and STDOUT ("Standard Out").
STDOUT is the stream that gets output to your command line. So anything that gets puts'd to the screen has been sent to STDOUT.
STDIN is the input stream, most commonly linked to your keyboard, though input can occasionally be sent in from other programs as well. This is what the gets method reads from.
There are some other useful streams that you'll bump into later, including STDERR, which carries error output.
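As a rough illustration of the standard streams, the sketch below writes to STDOUT and STDERR explicitly. Because real STDIN would wait for the keyboard, it substitutes a StringIO object holding made-up input; that stand-in (and the name "Alice") is an assumption for the example, not how you would normally read user input.

```ruby
require "stringio"

# Plain puts and STDOUT.puts go to the same place: standard output.
STDOUT.puts "This line goes to standard output"

# Error messages conventionally go to a separate stream.
STDERR.puts "This line goes to standard error"

# Stand-in for STDIN so the example does not block waiting for typing.
fake_input = StringIO.new("Alice\n")
name = fake_input.gets.chomp   # gets reads one line from the stream
puts "Hello, #{name}"          # => Hello, Alice
```

In a real script you would call gets on its own and type the name at the prompt.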
You won't need to work with these terms directly, but it's important to have heard them in context so you know what people are talking about when they are referred to. They are cornerstones of passing data.
In the world of web applications, file I/O isn't as common as with client-side apps but you'll still run into it from time to time. Seeing files as simply continuous chains of bytes in memory is a very useful mental model to have. It should help you demystify how your computer works and prepare you to accept that all data in the programming world gets read as single streams, one byte at a time.
In lower-level languages like Java or C, file I/O can be much more of a pain. You end up working with the bare-metal file streams, counting bytes, and generally fighting with complexity. Ruby takes away all that and lets you work with files in a simple and straightforward manner.
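As a small taste of that simplicity, here is a hedged sketch that writes a file and reads it back line by line inside a temporary directory. The file name "notes.txt" and its contents are hypothetical.

```ruby
require "tmpdir"

lines = []

# Work inside a temporary directory that is removed when the block ends.
Dir.mktmpdir do |dir|
  path = File.join(dir, "notes.txt")   # hypothetical file name

  # One call to write the whole file...
  File.write(path, "first line\nsecond line\n")

  # ...and one call to walk through it a line at a time.
  File.foreach(path) { |line| lines << line.chomp }
end

puts lines.inspect   # => ["first line", "second line"]
```

No byte counting or manual buffer handling is needed here; Ruby opens and closes the file for you.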
We'll dive into this in the next lesson.