How to create a numpy array of arbitrary length strings?
















I'm a complete rookie to Python, but it seems like a given string can be (effectively) arbitrary length, i.e. you can take a string str and keep adding to it: str += "some stuff...". Is there a way to make an array of such strings?

When I try this, each element only stores a single character:

strArr = numpy.empty(10, dtype='string')
for i in range(0,10):
    strArr[i] = "test"

On the other hand, I know I can initialize an array of fixed-length strings, i.e.

strArr = numpy.empty(10, dtype='S256')

which can store 10 strings of up to 256 characters.










      python arrays string numpy














      asked Feb 1 '13 at 3:58 by DilithiumMatrix
      edited Mar 7 at 6:36 by martineau






















          2 Answers














          You can do so by creating an array with dtype=object. If you assign a string that is longer than the element width of a normal (fixed-width) numpy array, numpy silently truncates the string:



          >>> a = numpy.array(['apples', 'foobar', 'cowboy'])
          >>> a[2] = 'bananas'
          >>> a
          array(['apples', 'foobar', 'banana'],
          dtype='|S6')
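The session above dates from Python 2, where string literals produce a byte-string dtype. As a minimal sketch, assuming Python 3 and a reasonably recent NumPy, the same truncation can be reproduced with a Unicode dtype. The name `wide` and the width of 16 are illustrative choices, not part of the original answer.

```python
import numpy as np

# Fixed-width string array: NumPy infers the width from the longest element.
a = np.array(['apples', 'foobar', 'cowboy'])
print(a.dtype)  # a Unicode dtype that is six characters wide

# Assigning a longer string silently drops the characters that do not fit.
a[2] = 'bananas'
print(a[2])

# Widening the dtype up front leaves room for longer strings.
wide = a.astype('U16')
wide[2] = 'bananas'
print(wide[2])
```

Widening only moves the limit; any string longer than the chosen width is still cut off.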


          But when you use dtype=object, you get an array of python object references. So you can have all the behaviors of python strings:



          >>> a = numpy.array(['apples', 'foobar', 'cowboy'], dtype=object)
          >>> a
          array([apples, foobar, cowboy], dtype=object)
          >>> a[2] = 'bananas'
          >>> a
          array([apples, foobar, bananas], dtype=object)


          Indeed, because it's an array of objects, you can assign any kind of python object to the array:



          >>> a[2] = {1:2, 3:4}
          >>> a
          array([apples, foobar, {1: 2, 3: 4}], dtype=object)
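The behavior shown in the sessions above can be collected into one short script. This is a sketch assuming Python 3; the appended text and the `lengths` and `kinds` names are made up for illustration. Recent NumPy versions print object elements with their quotes, so the displayed repr may differ from the older output shown above.

```python
import numpy as np

# Each slot holds a reference to an ordinary Python object.
a = np.array(['apples', 'foobar', 'cowboy'], dtype=object)

a[2] = 'bananas'        # longer than the original strings, not truncated
a[0] += ' and pears'    # plain Python string concatenation on one element
a[1] = {1: 2, 3: 4}     # any Python object can be stored

lengths = [len(x) for x in a]
kinds = [type(x).__name__ for x in a]
print(a)
print(lengths, kinds)
```

Because the elements are arbitrary objects, code that consumes such an array cannot assume every slot is a string.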


          However, this undoes a lot of the benefits of using numpy, which is so fast because it works on large contiguous blocks of raw memory. Working with python objects adds a lot of overhead. A simple example:



          >>> a = numpy.array(['abba' for _ in range(10000)])
          >>> b = numpy.array(['abba' for _ in range(10000)], dtype=object)
          >>> %timeit a.copy()
          100000 loops, best of 3: 2.51 us per loop
          >>> %timeit b.copy()
          10000 loops, best of 3: 48.4 us per loop
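`%timeit` is an IPython magic, so the comparison above does not run in a plain Python script. A rough equivalent using the standard `timeit` module might look like the sketch below. The variable names and the repeat count of 1000 are arbitrary, and the measured numbers depend on the machine and on the Python and NumPy versions, so no particular ratio should be expected.

```python
import timeit

import numpy as np

# Same data stored two ways: fixed-width characters vs. object references.
fixed = np.array(['abba'] * 10000)
objs = np.array(['abba'] * 10000, dtype=object)

repeats = 1000
t_fixed = timeit.timeit(fixed.copy, number=repeats)
t_objs = timeit.timeit(objs.copy, number=repeats)

# Report the average time per copy in microseconds.
print(f"fixed-width copy: {t_fixed / repeats * 1e6:.2f} us per call")
print(f"object copy:      {t_objs / repeats * 1e6:.2f} us per call")
```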





          answered Feb 1 '13 at 4:07 by senderle























          • Thanks, your first example is especially helpful--I never would have guessed that behavior! I'm not worried about the speed for this object, so slower access should be fine.

            – DilithiumMatrix
            Feb 1 '13 at 4:25

          • Nice answer. I've incorporated a link to it, with a demo, into a Python notebook page I'm working on about numpy array creation.

            – John Lockwood
            Mar 20 '15 at 13:15

          • @senderle if an array has object dtype, then np.fromstring(arr.tostring()) fails with the numpy error "Cannot create an object array from a string". Any ideas on how to solve this?

            – youkaichao
            Jul 31 '18 at 13:58

          • @游凯超 Hmm, that's a tough one. It's not a total surprise, because numpy really isn't designed to work with Python objects. Using object arrays is more of a shortcut than a proper use of numpy, so there's no real reason for numpy to support corner cases like that. My approach would probably be to get the maximum string length and use a standard fixed-width char array.

            – senderle
            Aug 6 '18 at 22:36

          • @游凯超 If your goal is to use strings as row labels or column headers, you should also look into structured arrays.

            – senderle
            Aug 6 '18 at 22:43
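The fixed-width workaround suggested in the comments could be sketched as follows, assuming Python 3 and that every element of the object array is a `str`. The sample strings and variable names are hypothetical. `tobytes` and `np.frombuffer` are used in place of the `tostring`/`fromstring` pair from the comment, since newer NumPy releases deprecate those older names.

```python
import numpy as np

obj_arr = np.array(['a', 'dude', 'bananas'], dtype=object)

# The longest string decides the width of the fixed-width dtype.
max_len = max(len(s) for s in obj_arr)
fixed = obj_arr.astype(f'U{max_len}')

# A fixed-width array is a plain block of bytes, so it survives a round trip.
raw = fixed.tobytes()
restored = np.frombuffer(raw, dtype=fixed.dtype)

print(restored)
```

The array returned by `np.frombuffer` is a read-only view of the bytes; calling `.copy()` on it gives a writable array.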
































          You could use the object data type:



          >>> import numpy
          >>> s = numpy.array(['a', 'b', 'dude'], dtype='object')
          >>> s[0] += 'bcdef'
          >>> s
          array([abcdef, b, dude], dtype=object)
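For comparison, here is a small sketch, assuming Python 3, of appending to every element. The object array is updated in a Python loop, while a fixed-width array can be concatenated with `np.char.add`, which returns a new array wide enough for the result. The suffixes and the names `fixed` and `joined` are illustrative.

```python
import numpy as np

# Object array: elements are Python strings, so += works per element.
s = np.array(['a', 'b', 'dude'], dtype=object)
s[0] += 'bcdef'
for i in range(len(s)):
    s[i] += '!'
print(s)

# Fixed-width array: np.char.add builds a new, wider array instead of
# modifying the original in place.
fixed = np.array(['a', 'b', 'dude'])
joined = np.char.add(fixed, 'bcdef')
print(joined, joined.dtype)
```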





          answered Feb 1 '13 at 4:05 by jterrace





















