What is the best way to store large sequences of text in Python?

Recently, I discovered that my student was engaged in an independent project in which he used very large lines (2-4 MB) as values ​​in a dictionary.

I never had a reason to work with such large blocks of text, and I wondered if there were performance problems associated with creating such large lines.

Is there a better way to do this than just create a string? I understand that this question is highly contextual, but I am looking for generalized answers that can cover more than one possible use case.

If you work with such text, how would you save it in your code, and do you do nothing else than if you were just working with a regular string of several characters?

+4
source share
2 answers

It depends on what you do with the strings. I'm not quite sure how Python stores strings, but I worked a lot on XEmacs (similar to GNU Emacs) and on the base implementation of Emacs Lisp, which is a dynamic language like Python, and I know how strings are implemented there. Strings will be stored as memory blocks, similar to arrays. There is no big problem creating large arrays in Python, so I don’t think that just storing strings this way will cause performance problems. Some things to consider:

  • ? , , O (N ^ 2), . Java StringBuilder. , Python, , , ''.join(array).

  • ? , - . O (n) ; , O (n/m), m - , , . , . , , , , , .

  • ? , , - . - , , , , , . , , , , .

, , , , , O (...).

+1

, :

  • , ()?

  • ?
    , , Python , , . , .

, , . .

0

All Articles