How to create a numpy array of arbitrary length strings?
I'm a complete rookie to Python, but it seems like a given string can be (effectively) arbitrary length, i.e. you can take a string str and keep adding to it: str += "some stuff...". Is there a way to make an array of such strings?
When I try this, each element only stores a single character:
strArr = numpy.empty(10, dtype='string')
for i in range(0, 10):
    strArr[i] = "test"
On the other hand, I know I can initialize an array of fixed-length strings, i.e.
strArr = numpy.empty(10, dtype='S256')
which can store 10 strings of up to 256 characters.
python arrays string numpy
edited Mar 7 at 6:36 by martineau
asked Feb 1 '13 at 3:58 by DilithiumMatrix
2 Answers
You can do so by creating an array of dtype=object. If you try to assign a long string to a normal numpy array, it truncates the string:
>>> a = numpy.array(['apples', 'foobar', 'cowboy'])
>>> a[2] = 'bananas'
>>> a
array(['apples', 'foobar', 'banana'],
      dtype='|S6')
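The session above comes from Python 2, where string literals produce a byte-string dtype (`|S6`). As a minimal sketch, assuming Python 3 and a current NumPy, the same silent truncation can be checked like this; the main difference is that the inferred dtype is a fixed-width Unicode type rather than a byte-string type:

```python
import numpy as np

# The dtype is inferred from the longest initial string: 6 characters.
a = np.array(['apples', 'foobar', 'cowboy'])
print(a.dtype)  # fixed-width Unicode, 6 characters per element

# Assigning a 7-character string silently drops the trailing character.
a[2] = 'bananas'
print(a[2])

# Asking for a wider fixed-width dtype up front avoids the truncation,
# at the cost of reserving that width for every element.
wide = np.array(['apples', 'foobar', 'cowboy'], dtype='U16')
wide[2] = 'bananas'
print(wide[2])
```

The wider dtype only moves the limit; anything longer than 16 characters would still be cut off.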
But when you use dtype=object, you get an array of Python object references. So you can have all the behaviors of Python strings:
>>> a = numpy.array(['apples', 'foobar', 'cowboy'], dtype=object)
>>> a
array([apples, foobar, cowboy], dtype=object)
>>> a[2] = 'bananas'
>>> a
array([apples, foobar, bananas], dtype=object)
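Applied to the loop from the question, a hedged sketch (assuming Python 3 and NumPy imported as `np`; the name `str_arr` is only illustrative) might look like this. `numpy.empty` with `dtype=object` fills the slots with `None`, so each one has to be assigned before it is used as a string:

```python
import numpy as np

# An object array holds references, so there is no per-element width limit.
str_arr = np.empty(10, dtype=object)

for i in range(10):
    str_arr[i] = "test"

# Elements are ordinary Python strings and can grow to any length.
str_arr[0] += " some stuff..." * 3
print(len(str_arr[0]), len(str_arr[1]))
```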
Indeed, because it's an array of objects, you can assign any kind of Python object to the array:
>>> a[2] = {1:2, 3:4}
>>> a
array([apples, foobar, {1: 2, 3: 4}], dtype=object)
However, this undoes a lot of the benefits of using numpy, which is so fast because it works on large contiguous blocks of raw memory. Working with python objects adds a lot of overhead. A simple example:
>>> a = numpy.array(['abba' for _ in range(10000)])
>>> b = numpy.array(['abba' for _ in range(10000)], dtype=object)
>>> %timeit a.copy()
100000 loops, best of 3: 2.51 us per loop
>>> %timeit b.copy()
10000 loops, best of 3: 48.4 us per loop
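`%timeit` is an IPython magic and is not available in a plain Python script. A rough equivalent using the standard-library `timeit` module is sketched below, assuming Python 3. Absolute numbers depend on the machine and the NumPy version, so none are quoted here:

```python
import timeit
import numpy as np

fixed = np.array(['abba'] * 10000)                # fixed-width string dtype
boxed = np.array(['abba'] * 10000, dtype=object)  # array of object references

# Time 1000 copies of each array and report the average per copy.
n = 1000
t_fixed = timeit.timeit(fixed.copy, number=n) / n
t_boxed = timeit.timeit(boxed.copy, number=n) / n

print(f"fixed-width copy: {t_fixed * 1e6:.2f} us")
print(f"object copy:      {t_boxed * 1e6:.2f} us")
```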
answered Feb 1 '13 at 4:07 by senderle
Thanks, your first example is especially helpful--I never would have guessed that behavior! I'm not worried about the speed for this object, so slower access should be fine.
– DilithiumMatrix Feb 1 '13 at 4:25
Nice answer. I've incorporated a link to it with demo into a python notebook page I'm working on about numpy array creation.
– John Lockwood Mar 20 '15 at 13:15
@senderle if an array has object dtype, then np.fromstring(arr.tostring()) will fail with the numpy error "Cannot create an object array from a string". Any ideas to solve this?
– youkaichao Jul 31 '18 at 13:58
@游凯超 hmm. That's a tough one. It's not a total surprise because numpy really isn't designed to work with Python objects. It's more of a shortcut than a proper use of numpy. So there's no real reason for them to support corner cases like that. My approach would probably be to get the maximum length of string and use a standard fixed-width char array.
– senderle Aug 6 '18 at 22:36
@游凯超 if your goal is to use strings as row labels or column headers you should also look into structured arrays.
– senderle Aug 6 '18 at 22:43
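The fixed-width workaround suggested in the comments could be sketched as follows, assuming Python 3 and a NumPy version that provides `tobytes` and `frombuffer` (the newer counterparts of the `tostring` and `fromstring` calls mentioned above). The variable names are illustrative:

```python
import numpy as np

words = np.array(['a', 'bcd', 'bananas'], dtype=object)

# Find the longest string and convert to a fixed-width Unicode array of that width.
max_len = max(len(w) for w in words)
fixed = words.astype(f'U{max_len}')

# A fixed-width array is a plain memory block, so it survives a bytes round trip.
raw = fixed.tobytes()
restored = np.frombuffer(raw, dtype=fixed.dtype)

print(restored.tolist())
```

The dtype has to be carried alongside the raw bytes, since the byte string alone does not record the element width.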
You could use the object data type:
>>> import numpy
>>> s = numpy.array(['a', 'b', 'dude'], dtype='object')
>>> s[0] += 'bcdef'
>>> s
array([abcdef, b, dude], dtype=object)
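As a small follow-up sketch (assuming Python 3; names are illustrative), `+` on a whole object array is applied element by element through each string's own `+` operator, so a suffix can be appended to every entry at once:

```python
import numpy as np

s = np.array(['a', 'b', 'dude'], dtype=object)

# In-place concatenation on a single element, as in the session above.
s[0] += 'bcdef'

# Adding a string to the whole array concatenates it onto every element.
suffixed = s + '!'

print(s.tolist())
print(suffixed.tolist())
```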
answered Feb 1 '13 at 4:05 by jterrace