Sum of occurrences of specific letters in an alphanumeric string using Excel

I have an array of alphanumeric data used when testing applications, and for certain reasons I need to calculate the sum of occurrences of letters from "a" to "f" in each line (this will be for further data processing):

02599caa0b600 --> should be 4
489455f183c1fb49b --> should be 5
678661081c1h
66410hd2f0kxd94f5bb
8a0339a4417
f6d9f967ts4af6e
886sf7asc3e85ec
03f1fhh3c3a2am
e491b17638m60
1m8h2m07bhaa4tnhbc4
29ma900a80m96m65
ca6a75f505tsac8
956828db8ts7fd1d
cf1d220a59a7851180e
a8b7852xd9e7a9
b85963fbe30718db9976
39b8kx8f85abb1b6
0xxb3b648ab
a8da75f730d45048
588h69d344

That is what the lines look like: they are about 10-30 characters long, and I expect about 3-5k of them daily for processing. Assumptions and limitations:

  • Letter case does NOT matter (happily).
  • The list of letters may change one day, but it will most likely still be a range, e.g. a-k, d-g etc., so the solution should be as flexible as possible.
  • Any temporary (helper) calculations / ranges are not prohibited, but the shorter the better.
  • I would prefer a pure Excel solution, but if that is too complicated, VBA is still an option. However, a complex Excel formula is better than a VBA “2-line code”, provided the former works as expected.

Things I've tried so far (as I noticed, this practice is very welcome):

  • A search through already answered questions, which did not turn up Excel-based solutions for anything like this. Other languages / approaches are not an option (except for VBA).
  • The best I have come up with so far is nested SUBSTITUTE functions, but that is dirty and very simplistic. Assuming the range may change to c-x, this will be a nightmare.
  • I'm not new to Excel, but things like complex array formulas are still hard nuts for me to crack - sad, but true ...

In any case, I do not ask for a “ready-to-work” “out of the box” solution — I ask for help and the right direction / approach for self-training and further understanding of such problems.

+4
4 answers

Here is my version. It is quite similar to the ones already posted, but here it is anyway, especially since you are interested in learning, which is so rare nowadays)

Assuming your list starts in A2, use the following array formula:

 =SUM(LEN($A2)-LEN(SUBSTITUTE($A2,CHAR(ROW(INDIRECT(CODE("a")&":"&CODE("f")))),""))) 

As a reminder, press CTRL+SHIFT+ENTER instead of the usual ENTER.

Some explanations:

  • The letter range a-f is generated from the character codes of the range's endpoints, which are converted back to an array of characters by the CHAR(ROW(INDIRECT(...))) construct.
  • Then comes your "nightmare" of SUBSTITUTEs: for each letter, the length of the string with that letter removed is subtracted from the original length, and the differences are summed.
  • Thanks to this double conversion (letters to codes and back), you do not need to hard-code the character codes)))
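
To make the steps above concrete, here is a rough trace of how the formula evaluates for the first sample string from the question, 02599caa0b600, assuming it sits in A2 (CODE("a") is 97 and CODE("f") is 102):

 CHAR(ROW(INDIRECT("97:102")))        ->  {"a";"b";"c";"d";"e";"f"}
 LEN($A2)                             ->  13
 LEN(SUBSTITUTE($A2,<each letter>,""))  ->  {11;12;12;13;13;13}
 LEN($A2) minus each of the above     ->  {2;1;1;0;0;0}
 SUM(...)                             ->  4

Note that SUBSTITUTE is case-sensitive, so the formula as written only counts lowercase letters. If uppercase letters can occur in the data, one possible variant (a sketch, not part of the shared sample file) wraps the text in LOWER:

 =SUM(LEN($A2)-LEN(SUBSTITUTE(LOWER($A2),CHAR(ROW(INDIRECT(CODE("a")&":"&CODE("f")))),"")))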

And two more similar samples of "nuts" - for educational purposes only.

If you need to count all occurrences of digits, you can still use the formula above with "0" and "9" as input (digits are the characters with codes 48 to 57, starting with 0). However, there is an even simpler solution:

 =SUM(LEN($A2)-LEN(SUBSTITUTE($A2,ROW($1:$10)-1,""))) 

The trick here is that we generate the numbers 0-9 from the row numbers 1-10 minus 1, because there is no row 0 and ROW($0:$9) would produce an error.

Finally, if you need to calculate the sum of all the digits in a string, use this:

 =SUM(IFERROR(VALUE(MID($A2,ROW(INDIRECT("1:"&LEN($A2))),1)),0)) 

Here we split the initial string into single characters using MID, try to convert each one to a number with VALUE, and let IFERROR return 0 for anything that is not a digit.
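
As an illustration, this is roughly how the digit-sum formula evaluates for a made-up string a1b27 in A2 (the string is only an assumption for this example, it is not taken from the question's data):

 ROW(INDIRECT("1:"&LEN($A2)))  ->  {1;2;3;4;5}
 MID($A2,{1;2;3;4;5},1)        ->  {"a";"1";"b";"2";"7"}
 VALUE(...)                    ->  {#VALUE!;1;#VALUE!;2;7}
 IFERROR(...,0)                ->  {0;1;0;2;7}
 SUM(...)                      ->  10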

The last two are (obviously) your favorite array "nuts" as well)))

I use the examples above in my Excel-for-QA training materials (by the way, welcome to SE, colleague!) to demonstrate typical functions and approaches to cracking such nuts. I hope this was useful to you. That said, all of the previous answers deserve at least a fair upvote from you, especially @barry's recipe)

For your convenience, the sample file is shared: https://www.dropbox.com/s/qo5k479oyawkrzh/SumLettersCount.xlsx

Good luck in testing)

+4

You can use SUBSTITUTE without nesting several SUBSTITUTE functions. For example, with a text string in A1, this formula in B1 will count all letters from a to f (upper or lower case):

=SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),{"a","b","c","d","e","f"},"")))

For a longer list of letters such as c-x, you can use this version to avoid listing them all:

=SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),CHAR(96+ROW(INDIRECT("3:24"))),"")))

3:24 represents letters 3 (c) to 24 (x), so you can easily change it to 1:26 for all letters or 15:25 for o to y, etc.
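
If you would rather not edit the "3:24" text by hand, one possible variation is to derive the row range from two cells holding the endpoint letters. This is only a sketch: the cells $F$1 (first letter) and $G$1 (last letter) are an assumption for the example, and the range is built from the character codes returned by CODE instead of from alphabet positions:

 =SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),CHAR(ROW(INDIRECT(CODE(LOWER($F$1))&":"&CODE(LOWER($G$1))))),"")))

With c in F1 and x in G1, the INDIRECT part becomes "99:120", and CHAR turns those codes back into the letters c to x.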

+6

This formula assumes your data is in column A, the first letter of the range you are looking for is in F1, and the last letter is in G1. It should be entered as an array formula and then copied down to the bottom of your data.

  =SUM(--(UPPER(MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1))>=UPPER($F$1))*--(UPPER(MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1))<=UPPER($G$1)))

Note that if the range of letters you are looking for changes, you only need to change the first letter of the range in cell F1 and the last letter in G1.

If you are sure that the number of characters in any of the lines will not exceed some maximum, say 50, then the formula can be simplified to:

  =SUM(--(UPPER(MID(A1,ROW($1:$50),1))>=$F$1)*--(UPPER(MID(A1,ROW($1:$50),1))<=$G$1)) 
+3

Assuming your data is in column A, try the following formula:

 =SUM(--NOT(ISERROR(SEARCH(MID(A1,ROW($1:$99),1),"abcdef"))))-99+LEN(A1) 

Enter the formula as an array formula, i.e. press Ctrl+Shift+Enter.
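
The -99+LEN(A1) tail may look odd, so here is a rough trace for the first sample string from the question, 02599caa0b600, assuming it sits in A1. The correction is needed because MID returns an empty string for positions past the end of the text, and SEARCH reports an empty string as found at position 1, so every unused position counts as a match:

 LEN(A1)                               ->  13
 MID(A1,ROW($1:$99),1)                 ->  13 real characters, then 86 empty strings
 matches among the 13 real characters  ->  4
 matches among the 86 empty strings    ->  86
 SUM(...)                              ->  90
 90-99+LEN(A1)                         ->  4

The 99 is simply an assumed upper bound on the string length, so it has to be raised in both places if longer strings can occur.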

0
