stlab.adobe.com Adobe Systems Incorporated

Rope

Category: containers Component type: type

Description

Ropes are a scalable string implementation: they are designed for efficient operations that involve the string as a whole. Operations such as assignment, concatenation, and substring take time that is nearly independent of the length of the string. Unlike C strings, ropes are a reasonable representation for very long strings such as edit buffers or mail messages. [1]

Though ropes can be treated as Containers of characters, and are almost Sequences, this is rarely the most efficient way to accomplish a task. Replacing an individual character in a rope is slow: each character replacement essentially consists of two substring operations followed by two concatenation operations. Ropes primarily target a more functional programming style.
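The substring-and-concatenate pattern described above can be sketched as follows. This is only an illustration: it assumes GCC's libstdc++, which ships the SGI rope as the extension header ext/rope in namespace __gnu_cxx (other toolchains may place it elsewhere or omit it), and the helper name with_char_replaced is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header

using __gnu_cxx::crope;

// Hypothetical helper: returns a new rope equal to `r` with the character at
// `pos` replaced by `c`. The input rope is left untouched, in keeping with
// the functional style that ropes are designed for.
crope with_char_replaced(const crope& r, std::size_t pos, char c) {
    assert(pos < r.size());
    const std::size_t first = 0;
    crope head = r.substr(first, pos);                   // characters [0, pos)
    crope tail = r.substr(pos + 1, r.size() - pos - 1);  // characters after pos
    return head + c + tail;                              // two concatenations
}
```

Calling with_char_replaced(crope("hello"), 2, 'L') yields a rope holding "heLlo", while the original rope keeps its value.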

They differ from vector<char> or reference-counted string implementations in the following ways.

Advantages:

  • Much faster concatenation and substring operations involving long strings. Inserting a character in the middle of a 10 megabyte rope should take on the order of tens of microseconds, even if a copy of the original is kept, e.g. as part of an edit history. In contrast, this would take on the order of a second for a conventional "flat" string representation. The time required for concatenation can be viewed as constant for most applications. It is perfectly reasonable to use a rope as the representation of a file inside a text editor.
  • Potentially much better space performance. Minor modifications of a rope can share memory with the original. Ropes are allocated in small chunks, significantly reducing memory fragmentation problems introduced by large blocks.
  • Assignment is simply a (possibly reference counted) pointer assignment. Unlike reference-counted copy-on-write implementations, this remains largely true even if one of the copies is subsequently slightly modified. It is very inexpensive to checkpoint old versions of a string, e.g. in an edit history.
  • It is possible to view a function producing characters as a rope. Thus a piece of a rope may be a 100MByte file, which is read only when that section of the string is examined. Concatenating a string to the end of such a file does not involve reading the file. (Currently the implementation of this facility is incomplete.)
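The cheap checkpointing mentioned in the list above might be used along these lines. The sketch assumes libstdc++'s ext/rope header and the __gnu_cxx namespace; the EditBuffer class and its member names are hypothetical and not part of the rope interface.

```cpp
#include <cstddef>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header
#include <vector>

using __gnu_cxx::crope;

// Hypothetical edit buffer that keeps every earlier version of its text.
class EditBuffer {
public:
    explicit EditBuffer(const crope& initial) : current_(initial) {}

    // Checkpoint the current text, then insert `text` before position `pos`.
    void insert(std::size_t pos, const char* text) {
        history_.push_back(current_);  // copies a rope handle, not the characters
        current_.insert(pos, text);
    }

    // Restore the most recent checkpoint; returns false if there is none.
    bool undo() {
        if (history_.empty()) return false;
        current_ = history_.back();
        history_.pop_back();
        return true;
    }

    const crope& text() const { return current_; }
    std::size_t checkpoints() const { return history_.size(); }

private:
    crope current_;
    std::vector<crope> history_;
};
```

Each checkpoint stores a rope that can share storage with the edited version, so keeping a long history of a large buffer should cost far less than one flat copy per edit.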

Disadvantages:

  • Single character replacements in a rope are expensive. A character update requires time roughly logarithmic in the length of the string. It is implemented as two substring operations followed by two concatenations.
  • A rope can be examined a character at a time through a const_iterator in amortized constant time, as for vector<char>. However, this is slower than for vector<char> by a significant constant factor (roughly a factor of 5 or 10 if little processing is done on each character and the string is long). Nonconst iterators involve additional checking, and are hence a bit slower still. (We expect that eventually some common algorithms will be specialized so that this cost is not encountered. Currently only output, conversion to a C string, and the single-character find member function are treated in this way.)
  • Iterators are on the order of a dozen words in size. This means that copying them, though not tremendously expensive, is not a trivial operation. Avoid postincrementing iterators; use preincrement whenever possible. (The interface also provides primitives for indexing into a string using integer character positions. Passing positions around is clearly much cheaper, but this makes the indexing operation expensive, again roughly logarithmic in the length of the rope.)
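A read-only traversal that follows the advice above (const_iterator, preincrement, no repeated iterator copies) could look like this. It is a sketch under the assumption that the rope comes from libstdc++'s ext/rope header in namespace __gnu_cxx; count_char is a hypothetical helper.

```cpp
#include <cstddef>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header

using __gnu_cxx::crope;

// Hypothetical helper: counts occurrences of `c` using const iterators only.
std::size_t count_char(const crope& r, char c) {
    std::size_t n = 0;
    // Build the end iterator once: rope iterators are about a dozen words in
    // size, so constructing one per loop iteration is avoidable overhead.
    const crope::const_iterator last = r.end();
    for (crope::const_iterator it = r.begin(); it != last; ++it) {  // preincrement
        if (*it == c) ++n;
    }
    return n;
}
```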

Experience with previous implementations for other programming languages suggests that ropes are a good choice as the normal or default representation of strings in a program. It will occasionally be necessary to use some type of character array, such as vector<char>, in places that are particularly sensitive to the performance of traversals or in-place updates. But the use of ropes minimizes the number of cases in which program running times become intolerable due to unexpectedly long string inputs.

A rope is almost, but not quite, a Sequence. It supports random access const_iterators. Forward or backward traversals take constant time per operation. Nonconstant iterators are also provided. However, assignment through a nonconst iterator is an expensive operation (basically logarithmic time, but with a large constant). It should be avoided in frequently executed code.

In order to discourage accidental use of expensive operations, the begin and end member functions on ropes return const_iterator. If non-const iterators are desired, the member functions mutable_begin and mutable_end should be used.
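As an illustration of the distinction, the sketch below updates a rope through mutable_begin and mutable_end. It assumes libstdc++'s ext/rope header and the __gnu_cxx namespace; to_upper_in_place is a hypothetical helper, and the loop is the kind of code that should stay out of frequently executed paths.

```cpp
#include <cctype>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header

using __gnu_cxx::crope;

// Hypothetical helper: upper-cases every character through mutable iterators.
// Each assignment through *it is an expensive rope update.
void to_upper_in_place(crope& r) {
    // r.begin() would not work here: it returns a const_iterator, so
    // assigning through it does not compile.
    for (crope::iterator it = r.mutable_begin(); it != r.mutable_end(); ++it) {
        const char ch = *it;  // the proxy reference converts to char
        *it = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
}
```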

Any modification of a rope invalidates const iterators referring to the rope. Mutable iterators refer to the same position in the same rope after an update. (This may be surprising if the iterator refers to a position after an insertion point.) They remain valid unless the iterator refers to a position that is more than one past the end of the resulting rope.

Definition

Defined in the header rope, and in the backward-compatibility header rope.h. The rope class, and the rope header, are SGI extensions; they are not part of the C++ standard.

Example

crope r(1000000, 'x');          // crope is rope<char>. wrope is rope<wchar_t>
                                // Builds a rope containing a million 'x's.
                                // Takes much less than a MB, since the
                                // different pieces are shared.
crope r2 = r + "abc" + r;       // concatenation; takes on the order of 100s
                                // of machine instructions; fast
crope r3 = r2.substr(1000000, 3);       // yields "abc"; fast.
crope r4 = r2.substr(1000000, 1000000); // also fast.
reverse(r2.mutable_begin(), r2.mutable_end());
                                // correct, but slow; may take a
                                // minute or more.

Template parameters

Parameter Description Default
T The rope's value type: usually char or wchar_t. [2]  
Alloc The rope's allocator, used for all internal memory management. Allocators

Model of

RandomAccessContainer. Almost, but not quite, a model of FrontInsertionSequence and BackInsertionSequence.

Type requirements

None, except for those imposed by the requirements of RandomAccessContainer.

Public base classes

None.

Members

Member Where defined Description
value_type Container The rope's value type T, usually char or wchar_t.
difference_type Container A signed integral type.
size_type Container An unsigned integral type.
reference Container Reference to a rope element. [3]
const_reference Container Const reference to T. [3]
pointer Container Pointer to T. [3]
const_pointer Container Const pointer to T. [3]
const_reverse_iterator ReversibleContainer Const iterator used to iterate backwards through a rope.
reverse_iterator ReversibleContainer Mutable iterator used to iterate backwards through a rope.
iterator Container Mutable RandomAccessIterator used to iterate through a rope.
const_iterator Container Const RandomAccessIterator used to iterate through a rope.
rope(const charT* s) rope Constructs a rope from a C string.
rope(const charT* s, size_t n) rope Constructs a rope from a (not necessarily null-terminated) array of charT.
rope(const const_iterator& f, const const_iterator& l) Sequence Creates a rope with a copy of a range.
rope(const iterator& f, const iterator& l) Sequence Creates a rope with a copy of a range.
rope(const charT* f, const charT* l) Sequence Creates a rope with a copy of a range.
rope(charT c) rope Single-character constructor.
rope() Container Default constructor.
rope(char_producer<charT>*, size_t, bool) rope See below.
rope(const rope& x) Container The copy constructor.
~rope() Container The destructor.
rope& operator=(const rope& x) Container The assignment operator.
void swap(rope& x) Container Swaps the contents of two ropes.
size_type size() const Container Returns the size of the rope.
size_type length() const rope Same as size().
size_type max_size() const Container Size of longest rope guaranteed to be representable.
bool empty() const Container Equivalent to size() == 0.
const_iterator begin() const Container Returns a const_iterator pointing to the beginning of the rope.
const_iterator end() const Container Returns a const_iterator pointing to the end of the rope.
iterator mutable_begin() rope Returns an iterator pointing to the beginning of the rope.
iterator mutable_end() rope Returns an iterator pointing to the end of the rope.
const_reverse_iterator rbegin() const ReversibleContainer Returns a const_reverse_iterator pointing to the beginning of the reversed rope.
const_reverse_iterator rend() const ReversibleContainer Returns a const_reverse_iterator pointing to the end of the reversed rope.
reverse_iterator mutable_rbegin() rope Returns a reverse_iterator pointing to the beginning of the reversed rope.
reverse_iterator mutable_rend() rope Returns a reverse_iterator pointing to the end of the reversed rope.
charT operator[](size_type n) const RandomAccessContainer Returns the nth element.
charT at(size_type pos) const RandomAccessContainer Returns the element at position pos.
reference mutable_reference_at(size_type n) rope Returns a reference to the nth element.
int compare(const rope&) const rope Three-way comparison. See below.
charT front() const Sequence Returns the first element.
charT back() const BackInsertionSequence Returns the last element.
void push_front(charT) FrontInsertionSequence Inserts a new element at the front.
void push_back(charT) BackInsertionSequence Inserts a new element at the end.
void pop_front() FrontInsertionSequence Removes the first element.
void pop_back() BackInsertionSequence Removes the last element.
iterator insert(const iterator& p, const rope& x) rope Inserts the contents of x before p.
iterator insert(const iterator& p, charT c) Sequence Inserts c before p.
iterator insert(const iterator& p) Sequence Inserts charT() before p.
iterator insert(const iterator& p, size_t n, charT c) Sequence Inserts n copies of c before p.
iterator insert(const iterator& p, const charT* s) rope Inserts a C string before p.
iterator insert(const iterator& p, const charT* s, size_t n) rope Inserts a (not necessarily null-terminated) array of charT before p.
iterator insert(const iterator& p, const charT* f, const charT* l) Sequence Inserts the range [f, l) before p.
iterator insert(const iterator& p, 
                const const_iterator& f, const const_iterator& l)
Sequence Inserts the range [f, l) before p.
iterator insert(const iterator& p, 
                const iterator& f, const iterator& l)
Sequence Inserts the range [f, l) before p.
void insert(size_t i, const rope& x) rope Inserts the contents of x before the ith element.
void insert(size_t i, charT c) rope Inserts the character c before the ith element.
void insert(size_t i) rope Inserts the character charT() before the ith element.
void insert(size_t i, size_t n, charT c) rope Inserts n copies of c before the ith element.
void insert(size_t i, const charT* s) rope Inserts a C string before the ith element.
void insert(size_t i, const charT* s, size_t n) rope Inserts a (not necessarily null-terminated) array of charT before the ith element.
void insert(size_t i, const charT* f, const charT* l) rope Inserts the range [f, l) before the ith element.
void insert(size_t i, 
            const const_iterator& f, const const_iterator& l)
rope Inserts the range [f, l) before the ith element.
void insert(size_t i, 
            const iterator& f, const iterator& l)
rope Inserts the range [f, l) before the ith element.
void erase(const iterator& p) Sequence Erases the element pointed to by p.
void erase(const iterator& f, const iterator& l) Sequence Erases the range [f, l).
void erase(size_t i, size_t n) rope Erases n elements, starting with the ith element.
append(const charT* s) rope Appends a C string.
append(const charT* s, size_t) rope Appends a (not necessarily null-terminated) array of charT.
append(const charT* f, const charT* l) rope Appends a range.
append(charT c) rope Appends the character c.
append() rope Appends the character charT().
append(size_t n, charT c) rope Appends n copies of c.
append(const rope& x) rope Appends the rope x.
void replace(const iterator& f, const iterator& l, const rope&) rope See below.
void replace(const iterator& f, const iterator& l, charT) rope See below.
void replace(const iterator& f, const iterator& l, const charT* s) rope See below.
void replace(const iterator& f, const iterator& l, 
             const charT* s, size_t n)
rope See below.
void replace(const iterator& f1, const iterator& l1, 
             const charT* f2, const charT* l2)
rope See below.
void replace(const iterator& f1, const iterator& l1, 
             const const_iterator& f2, const const_iterator& l2)
rope See below.
void replace(const iterator& f1, const iterator& l1, 
             const iterator& f2, const iterator& l2)
rope See below.
void replace(const iterator& p, const rope& x) rope See below.
void replace(const iterator& p, charT c) rope See below.
void replace(const iterator& p, const charT* s) rope See below.
void replace(const iterator& p, const charT* s, size_t n) rope See below.
void replace(const iterator& p, 
             const charT* f, const charT* l)
rope See below.
void replace(const iterator& p, 
             const_iterator f, const_iterator l)
rope See below.
void replace(const iterator& p, 
             iterator f, iterator l)
rope See below.
void replace(size_t i, size_t n, const rope& x) rope See below.
void replace(size_t i, size_t n1, const charT* s, size_t n2) rope See below.
void replace(size_t i, size_t n, charT c) rope See below.
void replace(size_t i, size_t n, const charT* s) rope See below.
void replace(size_t i, size_t n, 
             const charT* f, const charT* l)
rope See below.
void replace(size_t i, size_t n, 
             const const_iterator& f,
             const const_iterator& l)
rope See below.
void replace(size_t i, size_t n, 
             const iterator& f,
             const iterator& l)
rope See below.
void replace(size_t i, charT c) rope See below.
void replace(size_t i, const rope& x) rope See below.
void replace(size_t i, const charT* s) rope See below.
void replace(size_t i, const charT* s, size_t n) rope See below.
void replace(size_t i, const charT* f, const charT* l) rope See below.
void replace(size_t i, 
             const const_iterator& f, const const_iterator& l)
rope See below.
void replace(size_t i, 
             const iterator& f, const iterator& l)
rope See below.
rope substr(iterator f) const rope See below.
rope substr(const_iterator f) const rope See below.
rope substr(iterator f, iterator l) const rope See below.
rope substr(const_iterator f, const_iterator l) const rope See below.
rope substr(size_t i, size_t n = 1) const rope See below.
void copy(charT* buf) const rope Copies a rope into an array of charT.
size_type copy(size_type pos, size_type n, 
               charT* buf)
rope Copies a rope into an array of charT.
const charT* c_str() const rope See below.
void delete_c_str() rope See below.
rope operator+(const rope& L, const rope& R) rope Concatenates L and R. This is a global function, not a member function.
rope& operator+=(rope& L, const rope& R) rope Appends R to L. This is a global function, not a member function.
rope operator+(const rope& L, const charT* s) rope Concatenates L and s. This is a global function, not a member function.
rope& operator+=(rope& L, const charT* s) rope Appends s to L. This is a global function, not a member function.
rope operator+(const rope& L, charT c) rope Concatenates L and c. This is a global function, not a member function.
rope& operator+=(rope& L, charT c) rope Appends c to L. This is a global function, not a member function.
bool operator<(const rope&, const rope&) ForwardContainer Lexicographical comparison. This is a global function, not a member function.
bool operator==(const rope&, const rope&) ForwardContainer Tests two ropes for equality. This is a global function, not a member function.
ostream& operator<<(ostream& os, rope x) rope Outputs x to the stream os. This is a global function, not a member function.

New members

These members are not defined in the RandomAccessContainer requirements, but are specific to rope:

Function Description
rope(const charT* s) Constructs a rope from a C string. The rope consists of the sequence of characters starting with *s up to, but not including, the first null character.
rope(const charT* s, size_t n) Constructs a rope from an array of charT. The rope consists of the characters in the range [s, s + n). Note that this range is permitted to contain embedded null characters.
rope(charT c) Constructs a rope consisting of the single character c.
rope(char_producer<charT>* cp, size_t n, bool destroy) Constructs a rope of size n, whose characters are computed as needed by cp. The object *cp must be valid as long as any reference to the resulting rope, or a rope derived from it, may be used. If destroy is true, then delete cp will be executed automatically once cp is no longer needed. Typically destroy will be true unless cp is a pointer to statically allocated storage. It is rarely safe to allocate *cp on the stack.
size_type length() const Synonym for size().
iterator mutable_begin() Returns an iterator pointing to the beginning of the rope. This member function exists because mutable rope iterators are much more expensive than constant rope iterators.
iterator mutable_end() Returns an iterator pointing to the end of the rope. This member function exists because mutable rope iterators are much more expensive than constant rope iterators.
reverse_iterator mutable_rbegin() Returns a reverse_iterator pointing to the beginning of the reversed rope. This member function exists because mutable rope iterators are much more expensive than constant rope iterators.
reverse_iterator mutable_rend() Returns a reverse_iterator pointing to the end of the reversed rope. This member function exists because mutable rope iterators are much more expensive than constant rope iterators.
reference mutable_reference_at(size_type n) Returns a reference to the nth element. This member function exists because mutable references to rope elements have fairly high overhead.
int compare(const rope& x) const Three-way comparison, much like the function strcmp from the standard C library. Returns a negative number if *this is lexicographically less than x, a positive number if *this is lexicographically greater than x, and zero if neither rope is lexicographically less than the other.
iterator insert(const iterator& p, const rope& x) Inserts the contents of the rope x immediately before the position p.
iterator insert(const iterator& p, const charT* s) Inserts a C string immediately before the position p. The elements that are inserted are the sequence of characters starting with *s and up to, but not including, the first null character.
iterator insert(const iterator& p, const charT* s, size_t n) Inserts an array of charT. The elements that are inserted are the range [s, s + n). Note that this range is permitted to contain embedded null characters.
void insert(size_t i, const rope& x) Inserts the contents of the rope x immediately before the ith element.
void insert(size_t i, size_t n, charT c) Inserts n copies of c immediately before the ith element.
void insert(size_t i, const charT* s) Inserts a C string immediately before the ith element. The elements that are inserted are the sequence of characters starting with *s and up to, but not including, the first null character.
void insert(size_t i, const charT* s, size_t n) Inserts an array of charT immediately before the ith element. The elements that are inserted are the range [s, s + n). Note that this range is permitted to contain embedded null characters.
void insert(size_t i, charT c) Inserts the character c immediately before the ith element.
void insert(size_t i) Inserts the character charT() immediately before the ith element.
void insert(size_t i, const charT* f, const charT* l) Inserts the range [f, l) immediately before the ith element.
void insert(size_t i, 
            const const_iterator& f, const const_iterator& l)
Inserts the range [f, l) immediately before the ith element.
void insert(size_t i, 
            const iterator& f, const iterator& l)
Inserts the range [f, l) immediately before the ith element.
void erase(size_t i, size_t n) Erases n elements, starting with the ith element.
append(const charT* s) Adds a C string to the end of the rope. The elements that are inserted are the sequence of characters starting with *s and up to, but not including, the first null character.
append(const charT* s, size_t n) Adds an array of charT to the end of the rope. The elements that are inserted are the range [s, s + n). Note that this range is permitted to contain embedded null characters.
append(const charT* f, const charT* l) Adds the elements in the range [f, l) to the end of the rope.
append(charT c) Adds the character c to the end of the rope.
append() Adds the character charT() to the end of the rope.
append(const rope& x) Adds the contents of the rope x to the end of *this.
append(size_t n, charT c) Adds n copies of c to the end of *this.
void replace(const iterator& f, const iterator& l, const rope& x) Replaces the elements in the range [f, l) with the elements in x.
void replace(const iterator& f, const iterator& l, charT c) Replaces the elements in the range [f, l) with the single character c.
void replace(const iterator& f, const iterator& l, const charT* s) Replaces the elements in the range [f, l) with a C string: the sequence of characters beginning with *s and up to, but not including, the first null character.
void replace(const iterator& f, const iterator& l, 
             const charT* s, size_t n)
Replaces the elements in the range [f, l) with the elements in the range [s, s + n).
void replace(const iterator& f1, const iterator& l1, 
             const charT* f2, const charT* l2)
Replaces the elements in the range [f1, l1) with the elements in the range [f2, l2).
void replace(const iterator& f1, const iterator& l1, 
             const const_iterator& f2, const const_iterator& l2)
Replaces the elements in the range [f1, l1) with the elements in the range [f2, l2).
void replace(const iterator& f1, const iterator& l1, 
             const iterator& f2, const iterator& l2)
Replaces the elements in the range [f1, l1) with the elements in the range [f2, l2).
void replace(const iterator& p, const rope& x) Replaces the element pointed to by p with the elements in x.
void replace(const iterator& p, charT c) Replaces the element pointed to by p with the single character c.
void replace(const iterator& p, const charT* s) Replaces the element pointed to by p with a C string: the sequence of characters beginning with *s and up to, but not including, the first null character.
void replace(const iterator& p, const charT* s, size_t n) Replaces the element pointed to by p with the elements in the range [s, s + n).
void replace(const iterator& p, 
             const charT* f, const charT* l)
Replaces the element pointed to by p with the elements in the range [f, l).
void replace(const iterator& p, 
             const_iterator f, const_iterator l)
Replaces the element pointed to by p with the elements in the range [f, l).
void replace(const iterator& p, 
             iterator f, iterator l)
Replaces the element pointed to by p with the elements in the range [f, l).
void replace(size_t i, size_t n, const rope& x) Replaces the n elements beginning with the ith element with the elements in x.
void replace(size_t i, size_t n, charT c) Replaces the n elements beginning with the ith element with the single character c.
void replace(size_t i, size_t n, const charT* s) Replaces the n elements beginning with the ith element with a C string: the sequence of characters beginning with *s and up to, but not including, the first null character.
void replace(size_t i, size_t n1, const charT* s, size_t n2) Replaces the n1 elements beginning with the ith element with the elements in the range [s, s + n2).
void replace(size_t i, size_t n, 
             const charT* f, const charT* l)
Replaces the n elements beginning with the ith element with the characters in the range [f, l).
void replace(size_t i, size_t n, 
             const const_iterator& f,
             const const_iterator& l)
Replaces the n elements beginning with the ith element with the characters in the range [f, l).
void replace(size_t i, size_t n, 
             const iterator& f,
             const iterator& l)
Replaces the n elements beginning with the ith element with the characters in the range [f, l).
void replace(size_t i, charT c) Replaces the ith element with the character c.
void replace(size_t i, const rope& x) Replaces the ith element with elements from the rope x.
void replace(size_t i, const charT* s) Replaces the ith element with a C string: the sequence of characters beginning with *s and up to, but not including, the first null character.
void replace(size_t i, const charT* s, size_t n) Replaces the ith element with the elements in the range [s, s + n).
void replace(size_t i, const charT* f, const charT* l) Replaces the ith element with the range [f, l).
void replace(size_t i, 
             const const_iterator& f, const const_iterator& l)
Replaces the ith element with the range [f, l).
void replace(size_t i, 
             const iterator& f, const iterator& l)
Replaces the ith element with the range [f, l).
rope substr(iterator f) const Returns a new rope with a single element, *f. [4]
rope substr(const_iterator f) const Returns a new rope with a single element, *f. [4]
rope substr(iterator f, iterator l) const Returns a new rope that consists of the range [f, l). [4]
rope substr(const_iterator f, const_iterator l) const Returns a new rope that consists of the range [f, l). [4]
rope substr(size_t i, size_t n = 1) const Returns a new rope whose elements are the n characters starting at the position i. [4]
void copy(charT* buf) const Copies the characters in a rope into buf.
size_type copy(size_type pos, size_type n, 
               charT* buf)
Copies n characters, starting at position pos in the rope, into buf. If the rope contains fewer than pos + n characters, then instead it only copies size() - pos characters.
const charT* c_str() const Returns a pointer to a null-terminated sequence of characters that contains all of the characters in a rope. [5] [6] The resulting sequence of characters is valid at least as long as the rope remains valid and unchanged. Note that the first invocation of this operation on long strings is slow: it is linear in the length of the rope.
void delete_c_str() Reclaims the internal storage used by c_str. Note that this invalidates the pointer that c_str returns.
rope operator+(const rope& L, const rope& R) Returns a new rope consisting of the concatenation of L and R. This is a global function, not a member function.
rope& operator+=(rope& L, const rope& R) Modifies L by appending R, and returns L. This is a global function, not a member function.
rope operator+(const rope& L, const charT* s) Returns a new rope consisting of the concatenation of L and all of the characters from s up to, but not including, the first null character. This is a global function, not a member function.
rope& operator+=(rope& L, const charT* s) Modifies L by appending the characters from s up to, but not including, the first null character. The return value is L. This is a global function, not a member function.
rope operator+(const rope& L, charT c) Returns a new rope consisting of L with the character c appended to it. This is a global function, not a member function.
rope& operator+=(rope& L, charT c) Modifies L by appending the character c. This is a global function, not a member function.
ostream& operator<<(ostream& os, rope x) Outputs x to the stream os. This is a global function, not a member function.
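The position-based members listed above can be combined as in the following sketch. It assumes libstdc++'s ext/rope header and the __gnu_cxx namespace; edit_demo and sorts_before are hypothetical helpers. Positions are passed as std::size_t variables so that the integer-position overloads are selected unambiguously.

```cpp
#include <cstddef>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header

using __gnu_cxx::crope;

// Hypothetical walk-through of insert, erase, replace, and append by position.
crope edit_demo() {
    crope r("The quick fox");
    const std::size_t fox = 10;          // index of 'f'
    r.insert(fox, "brown ");             // "The quick brown fox"
    const std::size_t start = 4;
    const std::size_t quick_len = 6;     // length of "quick "
    r.erase(start, quick_len);           // "The brown fox"
    const std::size_t brown_len = 5;     // length of "brown"
    r.replace(start, brown_len, "red");  // "The red fox"
    r.append(" jumps");                  // "The red fox jumps"
    return r;
}

// Hypothetical helper: true if `a` is lexicographically less than `b`.
bool sorts_before(const crope& a, const crope& b) {
    return a.compare(b) < 0;  // strcmp-like three-way result
}
```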

Notes

[1] For a detailed discussion of the rope data structure, see H.-J. Boehm, R. Atkinson, and M. Plass, "Ropes: An Alternative to Strings", Software Practice and Experience 25(12):1315, 1995.

[2] Since the value type is usually either char or wchar_t, the library introduces two abbreviations: crope is a typedef for rope<char>, and wrope is a typedef for rope<wchar_t>.

[3] rope::reference is not value_type&, but a proxy type. In fact, reference is a typedef for the nested class charT_ref_proxy. However, const_reference is simply const value_type&. Similarly, const_pointer is just const value_type*, but pointer is a proxy type. If r is an object of type reference, then &r is of type pointer.
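A short sketch of what the proxy reference means in practice follows. It assumes libstdc++'s ext/rope header and the __gnu_cxx namespace, where the proxy class has an implementation-specific name; set_char is a hypothetical helper.

```cpp
#include <cstddef>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header
#include <type_traits>

using __gnu_cxx::crope;

// The mutable reference type is a class object, not a plain char&.
static_assert(!std::is_same<crope::reference, char&>::value,
              "rope::reference is expected to be a proxy type");

// Hypothetical helper: overwrites the character at `pos` through the proxy.
void set_char(crope& r, std::size_t pos, char c) {
    crope::reference ref = r.mutable_reference_at(pos);  // proxy object
    ref = c;  // assignment through the proxy updates the rope itself
}
```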

[4] Note that the return value of substr is conceptually a distinct rope: the two ropes may share storage, but this is a hidden implementation detail. If you modify a rope returned by substr, this will not change the value of the original rope.
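The independence of a substring from its source can be checked with a sketch like the one below, which assumes libstdc++'s ext/rope header and the __gnu_cxx namespace; shout_prefix is a hypothetical helper.

```cpp
#include <cstddef>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header

using __gnu_cxx::crope;

// Hypothetical helper: takes the first `n` characters of `original`, appends
// '!', and returns the result. `original` itself is never modified, even if
// the two ropes share storage internally.
crope shout_prefix(const crope& original, std::size_t n) {
    const std::size_t first = 0;
    crope piece = original.substr(first, n);
    piece.push_back('!');
    return piece;
}
```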

[5] The final const qualifier in the member function c_str() is conceptually slightly inaccurate in the interest of conformance to the basic_string interface in the draft C++ standard; the rope is updated to cache the converted string.

[6] Concurrent calls to c_str() are allowed; the cache is updated atomically.
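The lifetime rules for the pointer returned by c_str can be illustrated as follows. The sketch assumes libstdc++'s ext/rope header and the __gnu_cxx namespace; flat_length is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstring>
#include <ext/rope>  // assumption: libstdc++ location of the SGI rope header

using __gnu_cxx::crope;

// Hypothetical helper: measures the rope through its flattened C string and
// then releases the cached copy.
std::size_t flat_length(crope& r) {
    const char* flat = r.c_str();  // may flatten the rope: linear in its length
    const std::size_t n = std::strlen(flat);
    r.delete_c_str();              // `flat` must not be used after this call
    return n;
}
```

For ropes without embedded null characters the result should equal r.size(); the rope can still be flattened again by a later call to c_str.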

See also

RandomAccessContainer, Sequence, Vector, sequence_buffer

Copyright © 2006-2007 Adobe Systems Incorporated.
