Enhanced C#
Language of your choice: library documentation

Documentation moved to ecsharp.net

Loyc.Syntax.StreamCharSource Class Reference

Exposes a stream as an ICharSource, as though it were an array of characters. The stream must support seeking, and if a text decoder is specified, it must meet certain constraints.


Inheritance diagram for Loyc.Syntax.StreamCharSource: Loyc.Collections.ICharSource, Loyc.Collections.IListSource< out T >

Remarks

This class reads small blocks of bytes from a stream, reloading blocks from the stream when necessary. Data is cached with a pair of character buffers, and a third buffer is used to read from the stream. A Stream is required rather than a TextReader because TextReader doesn't support seeking.
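
The seek-then-decode step described above can be sketched with plain .NET types. This is only an illustration: DecodeBlock is a hypothetical helper, not part of Loyc, and it does not reproduce the class's caching or buffer management. It shows why a seekable Stream is needed and what a decoder does when a block ends partway through a character.

```csharp
using System;
using System.IO;
using System.Text;

static class BlockReadSketch
{
    // Hypothetical helper (not part of Loyc): seeks to a byte position,
    // reads up to byteCount bytes, and decodes them with a fresh decoder.
    public static string DecodeBlock(Stream stream, long bytePosition, int byteCount, Encoding encoding)
    {
        // Random access is the reason a Stream is used instead of a TextReader.
        stream.Seek(bytePosition, SeekOrigin.Begin);

        byte[] buf = new byte[byteCount];
        int read = 0;
        while (read < byteCount)
        {
            int n = stream.Read(buf, read, byteCount - read);
            if (n == 0) break; // end of stream
            read += n;
        }

        // A new decoder knows nothing about the bytes before bytePosition,
        // so bytePosition must fall on a character boundary.
        Decoder decoder = encoding.GetDecoder();
        char[] chars = new char[encoding.GetMaxCharCount(read)];

        // flush: false leaves an incomplete trailing byte sequence undecoded
        // instead of turning it into a replacement character.
        int charCount = decoder.GetChars(buf, 0, read, chars, 0, false);
        return new string(chars, 0, charCount);
    }
}
```

With a UTF-8 stream, a block that stops in the middle of a multi-byte character yields only the complete characters before it; the remaining bytes belong to the next block.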

This class assumes the underlying stream never changes.

StreamCharSource does not (and probably cannot, if I understand the System.Text.Decoder API correctly) save the decoder state at each block boundary. Consequently, only encodings that meet special constraints will work with StreamCharSource. These include Encoding.Unicode, Encoding.UTF8, and Encoding.UTF32, but not Encoding.UTF7. Using an unsupported encoding will cause exceptions and/or corrupted output while reading from the StreamCharSource.

The decoder must meet the following constraints:

  1. Characters must be divided on a byte boundary. UTF-7 doesn't work because some characters are encoded using Base64.
  2. Between characters output by the decoder, the decoder must be stateless. Therefore, encodings that support compression generally won't work.
  3. The decoder must produce at least one character from a group of 8 bytes (StreamCharSource.MaxSeqSize).
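
One rough way to probe the first two constraints for a given encoding is to restart a fresh decoder at every character boundary of a sample string and confirm that the tail decodes unchanged. The helper below is a sketch, not part of Loyc: RestartsCleanly is a hypothetical name, it assumes the sample contains no surrogate pairs, and passing the check on one sample does not prove that an encoding is safe in general.

```csharp
using System;
using System.Text;

static class DecoderConstraintCheck
{
    // Hypothetical helper (not part of Loyc). Returns true if decoding can be
    // restarted with a brand-new decoder at every character boundary of 'sample'.
    // Assumption: 'sample' contains no surrogate pairs.
    public static bool RestartsCleanly(Encoding encoding, string sample)
    {
        byte[] bytes = encoding.GetBytes(sample);
        for (int i = 0; i < sample.Length; i++)
        {
            // Byte offset of character i, estimated by encoding the prefix alone.
            int offset = encoding.GetByteCount(sample.Substring(0, i));
            if (offset > bytes.Length)
                return false; // the prefix does not line up with the full encoding

            int count = bytes.Length - offset;

            // A fresh decoder has no memory of the bytes before 'offset'.
            Decoder fresh = encoding.GetDecoder();
            char[] chars = new char[encoding.GetMaxCharCount(count)];
            int n = fresh.GetChars(bytes, offset, count, chars, 0, true);

            if (new string(chars, 0, n) != sample.Substring(i))
                return false;
        }
        return true;
    }
}
```

For example, `DecoderConstraintCheck.RestartsCleanly(Encoding.UTF8, "h\u00E9llo")` exercises boundaries before and after a two-byte character.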

Properties

override int Count [get]
 

Public Member Functions

 StreamCharSource (Stream stream)
 
 StreamCharSource (Stream stream, Decoder decoder)
 
 StreamCharSource (Stream stream, Encoding encoding)
 
 StreamCharSource (Stream stream, Decoder decoder, int bufSize)
 
new StringSlice Slice (int startIndex, int length)
 Returns a substring from the character source. If some of the requested characters are past the end of the stream, the string is truncated to the available number of characters.
 
sealed override char TryGet (int index, out bool fail)
 Gets the item at the specified index, and does not throw an exception on failure.
 

Protected Member Functions

void SwapBlks ()
 
bool Access (int charIndex)
 
void ReloadBlockOf (int charIndex)
 
void ScanPast (int index)
 
void ReadNextBlock ()
 

Protected fields

Stream _stream
 
byte[] _buf
 
char[] _blk
 
int _blkStart
 
int _blkLen
 
List< Pair< int, uint > > _blkOffsets = new List<Pair<int,uint>>()
 A sorted list of mappings between byte positions and character indexes. In each Pair<A,B>, A is the character index and B is the byte index. This list is built on demand.
 
bool _reachedEnd = false
 Set true when the last block has been scanned. If true, then _eofIndex and _eofPosition indicate the Count and the size of the stream, respectively.
 
int _eofIndex = 0
 _eofIndex is the character index of EOF if it has been reached or, if not, the index of the first unscanned character. _eofIndex equals _blkOffsets[_blkOffsets.Count-1].A.
 
uint _eofPosition = 0
 _eofPosition is the byte position of EOF if it has been reached or, if not, the byte position of the first unscanned character. _eofPosition equals _blkOffsets[_blkOffsets.Count-1].B.
 
Decoder _decoder
 
const int DefaultBufSize = 2048 + MaxSeqSize - 1
 
const int MaxSeqSize = 8
 

Member Function Documentation

new StringSlice Loyc.Syntax.StreamCharSource.Slice (int startIndex, int length) [inline]

Returns a substring from the character source. If some of the requested characters are past the end of the stream, the string is truncated to the available number of characters.

Parameters
  startIndex  Index of first character to return. If startIndex >= Count, an empty string is returned.
  length      Number of characters desired.
Exceptions
  ArgumentException  Thrown if startIndex or length is negative.

Implements Loyc.Collections.ICharSource.
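
The truncation rules documented for Slice can be illustrated on an ordinary string. This is a sketch of the documented behavior only: TruncatingSlice is a hypothetical helper that returns a string rather than a StringSlice, and it says nothing about how StreamCharSource implements the method.

```csharp
using System;

static class SliceSemanticsSketch
{
    // Hypothetical helper (not part of Loyc) that mirrors the documented rules:
    // negative arguments throw, a start at or past the end gives an empty string,
    // and the result is shortened when fewer than 'length' characters remain.
    public static string TruncatingSlice(string text, int startIndex, int length)
    {
        if (startIndex < 0)
            throw new ArgumentException("startIndex must not be negative", nameof(startIndex));
        if (length < 0)
            throw new ArgumentException("length must not be negative", nameof(length));
        if (startIndex >= text.Length)
            return "";

        int available = text.Length - startIndex;
        return text.Substring(startIndex, Math.Min(length, available));
    }
}
```

Unlike String.Substring, which throws when the requested range runs past the end, this behavior lets a caller ask for a fixed-size window without first checking how much input is left.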

sealed override char Loyc.Syntax.StreamCharSource.TryGet (int index, out bool fail) [inline]

Gets the item at the specified index, and does not throw an exception on failure.

Parameters
  index  An index in the range 0 to Count-1.
  fail   A flag that is set on failure.
Returns
  The element at the specified index, or default(T) if the index is not valid.

In my original design, the caller could provide a value to return on failure, but this would not allow T to be marked as "out" in C# 4. For the same reason, we cannot have a ref/out T parameter. Instead, the following extension methods are provided:

bool TryGet(int index, ref T value);
T TryGet(int index, T defaultValue);

Implements Loyc.Collections.IListSource< out T >.
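
To make the covariance argument concrete, here is a minimal sketch of how the two extension methods could be layered on top of the fail-flag method. ITryGetSource and StringSource are stand-ins invented for this example; they are not Loyc types, and the real extension methods may be written differently. Because T appears only as a return type in the interface, it can be declared `out T`, while the `ref T` and `T defaultValue` parameters live in static methods where variance rules do not apply.

```csharp
using System;

// Stand-in for the relevant part of IListSource<out T> (hypothetical).
// T is used only in return position, so it may be covariant.
interface ITryGetSource<out T>
{
    int Count { get; }
    T TryGet(int index, out bool fail);
}

static class TryGetExtensions
{
    // Leaves 'value' untouched and returns false if the index is invalid.
    public static bool TryGet<T>(this ITryGetSource<T> source, int index, ref T value)
    {
        T result = source.TryGet(index, out bool fail);
        if (fail) return false;
        value = result;
        return true;
    }

    // Returns 'defaultValue' if the index is invalid.
    public static T TryGet<T>(this ITryGetSource<T> source, int index, T defaultValue)
    {
        T result = source.TryGet(index, out bool fail);
        return fail ? defaultValue : result;
    }
}

// Tiny in-memory source used only to exercise the extension methods.
sealed class StringSource : ITryGetSource<char>
{
    private readonly string _text;
    public StringSource(string text) { _text = text; }
    public int Count => _text.Length;

    public char TryGet(int index, out bool fail)
    {
        // The unsigned comparison rejects negative and too-large indexes at once.
        fail = (uint)index >= (uint)_text.Length;
        return fail ? default(char) : _text[index];
    }
}
```

A caller can then write `source.TryGet(i, '?')` to get a fallback character, or `source.TryGet(i, ref c)` to update `c` only on success.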

Member Data Documentation

List<Pair<int,uint>> Loyc.Syntax.StreamCharSource._blkOffsets = new List<Pair<int,uint>>() [protected]

A sorted list of mappings between byte positions and character indexes. In each Pair<A,B>, A is the character index and B is the byte index. This list is built on demand.
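
Because the list is sorted by character index, the block containing a given character can be located with a binary search. The sketch below assumes that lookup strategy; the documentation does not say how StreamCharSource actually searches the list. It uses value tuples in place of Pair<int,uint>, and FindBlock is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

static class BlockOffsetLookup
{
    // Hypothetical helper (not part of Loyc). Returns the position in 'offsets'
    // of the last entry whose CharIndex is <= charIndex, or -1 if there is none.
    // 'offsets' must be sorted by CharIndex in ascending order.
    public static int FindBlock(List<(int CharIndex, uint BytePos)> offsets, int charIndex)
    {
        int lo = 0, hi = offsets.Count - 1, found = -1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (offsets[mid].CharIndex <= charIndex)
            {
                found = mid;   // candidate; a later entry may still qualify
                lo = mid + 1;
            }
            else
            {
                hi = mid - 1;
            }
        }
        return found;
    }
}
```

The byte position stored in the matching entry is where a reader would seek before decoding forward to the requested character.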

int Loyc.Syntax.StreamCharSource._eofIndex = 0 [protected]

_eofIndex is the character index of EOF if it has been reached or, if not, the index of the first unscanned character. _eofIndex equals _blkOffsets[_blkOffsets.Count-1].A.

uint Loyc.Syntax.StreamCharSource._eofPosition = 0 [protected]

_eofPosition is the byte position of EOF if it has been reached or, if not, the byte position of the first unscanned character. _eofPosition equals _blkOffsets[_blkOffsets.Count-1].B.

bool Loyc.Syntax.StreamCharSource._reachedEnd = false [protected]

Set true when the last block has been scanned. If true, then _eofIndex and _eofPosition indicate the Count and the size of the stream, respectively.