Extract all textContent in an HTMLCollection to an array with JavaScript


Declare your variables – whenever you assign to or reference a variable without defining it first, you will either (1) implicitly create a property on the global object (which can result in weird bugs), or (2) throw an error, if you’re running in strict mode. You currently aren’t declaring any of your variables. Fix it by putting const (or, when needed, let) in front of them when assigning to them for the first time, e.g. const htmlObject = $(anycasestr);
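As a minimal sketch of the second outcome (the function name and the undeclared identifier below are made up for illustration), assigning to a name that was never declared throws a ReferenceError inside strict-mode code:

```javascript
// Hypothetical helper: reports what happens when strict-mode code
// assigns to an identifier that was never declared.
function assignWithoutDeclaring() {
  'use strict';
  try {
    neverDeclaredAnywhere = 1; // no const / let / var in front
    return 'no error';
  } catch (err) {
    return err.name;
  }
}

const outcome = assignWithoutDeclaring();
console.log(outcome); // ReferenceError
```

Without the 'use strict' directive (and outside a module), the same assignment would quietly create a global property instead.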

jQuery or DOM methods? You’re using jQuery to turn the string into a jQuery collection of elements, but then you’re using getElementsByTagName to select children. If you’re using jQuery, it’s more concise and consistent to use it to select the <div> children as well. To find descendants of an element which match a particular tag name, call .find on the jQuery collection – then, you can use .map to turn the found elements into a collection of just their text:

const anycasestr = `<div style="color: rgb(51, 51, 51); background-color: rgb(253, 246, 227); font-family: Menlo, Monaco, &quot;Courier New&quot;, monospace; font-size: 12px; line-height: 18px;"><div>refinement</div><div>decent</div><div>elegant</div></div>`;
const $parent = $(anycasestr);
const arr = $parent.find('div')
  .map((_, child) => child.textContent)
  .get(); // turn the jQuery collection of strings into an array of strings
console.log(arr);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>

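Or, you can use DOMParser instead. Using DOMParser rather than jQuery to turn the text into a collection of elements can avoid accidental execution of malicious scripts: elements created by $(htmlString) belong to the live document, so inline handlers can fire even if the elements are never inserted into the page, whereas the document returned by DOMParser is inert. Example exploit using jQuery: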

const anycasestr = `<img src="https://codereview.stackexchange.com/" onerror="alert('evil')">`;
const $parent = $(anycasestr);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>

With DOMParser:

const anycasestr = `<div style="color: rgb(51, 51, 51); background-color: rgb(253, 246, 227); font-family: Menlo, Monaco, &quot;Courier New&quot;, monospace; font-size: 12px; line-height: 18px;"><div>refinement</div><div>decent</div><div>elegant</div></div>`;
const doc = new DOMParser().parseFromString(anycasestr, 'text/html');
const arr = [...doc.querySelectorAll('div > div')]
  .map(div => div.textContent);
console.log(arr);

The query string div > div selects <div> elements which are direct children of another <div>. It works exactly the same way as CSS selectors. querySelectorAll is a great tool for concise selection of elements – it can be easier to write and understand at a glance than other methods (like your original htmlObject[0].getElementsByTagName("div")).

Array.prototype.slice.call is a bit verbose – on non-ancient environments, you can use spread syntax instead, like I did above. Creating an array all at once by mapping is also somewhat more elegant than declaring an array then .pushing onto it.
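To make the comparison concrete, here is a rough sketch that runs without a browser. The fakeNodeList object is a hypothetical stand-in for what querySelectorAll returns, not a real NodeList; it only mimics the parts that matter here (indexed entries, a length, and the array iterator).

```javascript
// Hypothetical stand-in for the NodeList that querySelectorAll returns:
// indexed entries, a length, and the array iterator, but no .map of its own.
const fakeNodeList = {
  0: { textContent: 'refinement' },
  1: { textContent: 'decent' },
  2: { textContent: 'elegant' },
  length: 3,
  [Symbol.iterator]: Array.prototype[Symbol.iterator],
};

// Older style: borrow slice to copy the collection, then push in a loop.
const viaPush = [];
const copied = Array.prototype.slice.call(fakeNodeList);
for (let i = 0; i < copied.length; i++) {
  viaPush.push(copied[i].textContent);
}

// Spread plus map builds the same array in one expression.
const viaSpread = [...fakeNodeList].map(node => node.textContent);

console.log(viaPush);
console.log(viaSpread);
```

Both approaches produce the same array of strings; the second just has less bookkeeping to get wrong.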

If you had more <div> children and wanted to take only the text from the first 3 of them, it’d be more functional to .slice the array of elements instead of putting an iteration count in a for loop:

const anycasestr = `<div style="color: rgb(51, 51, 51); background-color: rgb(253, 246, 227); font-family: Menlo, Monaco, &quot;Courier New&quot;, monospace; font-size: 12px; line-height: 18px;">
  <div>refinement</div>
  <div>decent</div>
  <div>elegant</div>
  <div>don't include me</div>
  <div>don't include me</div>
  <div>don't include me</div>
</div>`;
const doc = new DOMParser().parseFromString(anycasestr, 'text/html');
const arr = [...doc.querySelectorAll('div > div')]
  .slice(0, 3)
  .map(div => div.textContent);
console.log(arr);
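One practical difference, sketched here with plain strings standing in for the elements' text (both helper names are hypothetical): .slice(0, 3) simply returns fewer items when fewer than three exist, while a loop with a hard-coded count keeps indexing past the end.

```javascript
// Hypothetical helpers comparing the two ways of taking the first n items.
function firstWithLoop(items, n) {
  const result = [];
  for (let i = 0; i < n; i++) {
    result.push(items[i]); // items[i] is undefined once i passes the end
  }
  return result;
}

function firstWithSlice(items, n) {
  return items.slice(0, n); // stops at the end of the array on its own
}

const onlyTwo = ['refinement', 'decent'];
const looped = firstWithLoop(onlyTwo, 3);
const sliced = firstWithSlice(onlyTwo, 3);
console.log(looped.length); // 3
console.log(sliced.length); // 2
```

With real elements, the loop version would go on to read .textContent of undefined and throw, so the fixed count needs an extra bounds check that .slice makes unnecessary.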

"in terms of computational concerns?"

Unless the stuff that needs to be parsed is unreasonably large, performance for this sort of thing is not a concern; better to write clean, readable, maintainable code. If you later find that something is taking longer to run than is ideal, you can identify the bottleneck and then figure out how to fix it. (But this almost certainly won’t be the bottleneck.)
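If you do end up needing to look for a bottleneck, one rough way to start is to time the suspect step directly. This is only a sketch: timeIt and the fake workload are hypothetical, and it assumes performance.now() is available as a global (browsers, and Node.js 16 or later).

```javascript
// Hypothetical helper: runs fn once and reports how long it took.
function timeIt(label, fn) {
  const start = performance.now();
  const result = fn();
  const elapsedMs = performance.now() - start;
  console.log(`${label}: ${elapsedMs.toFixed(2)} ms`);
  return { result, elapsedMs };
}

// Stand-in workload: map over 100,000 fake elements.
const fakeElements = Array.from({ length: 100000 }, (_, i) => ({
  textContent: `item ${i}`,
}));
const timed = timeIt('extract text', () =>
  fakeElements.map(el => el.textContent)
);
```

A single run like this is noisy, so treat the number as a rough indication rather than a benchmark.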