Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Hi
I've tried variations on MID, FIND, LEN, LEFT and Right and still not able to do this. I have strings that can be 89, 90, 91 or 92 characters long and sometimes more. I want to find the name of an enzyme in this string that can vary to between 5 and 12 characters. Something like this: Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 100.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 45.0 ATP Concentration I want to list in separate cells: 1) the number of the Daughterboard (286338) 2) the name of the kinase (XXX(h)) 3) The ATP concentration and I don't want to column pharse, if that is the right term. Many thanks for your time Andrew j |
#2
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
I'm a bit confused. Where is the name of your enzyme? What column is
your 89+ character string in? Do you want to split the 3 items out into the next 3 columns for every row of data, or just for some that match the enzyme? Does your data follow the example closely, with variation in character length/position only in the ATP concentration and what you call the kinase? Pete On Jul 2, 10:25*pm, Hankjam ........ wrote: Hi I've tried variations on MID, FIND, LEN, LEFT and Right and still not able to do this. I have strings that can be 89, 90, 91 or 92 characters long and sometimes more. I want to find the name of an enzyme in this string that can vary to between 5 and 12 characters. Something like this: Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 100.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 45.0 ATP Concentration I want to list in separate cells: 1) the number of the Daughterboard (286338) 2) the name of the kinase (XXX(h)) 3) The ATP concentration and I don't want to column pharse, if that is the right term. Many thanks for your time Andrew j |
#3
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
On Wed, 2 Jul 2008 14:33:40 -0700 (PDT), Pete_UK
wrote: I'm a bit confused. Where is the name of your enzyme? What column is your 89+ character string in? Do you want to split the 3 items out into the next 3 columns for every row of data, or just for some that match the enzyme? Does your data follow the example closely, with variation in character length/position only in the ATP concentration and what you call the kinase? Pete <snipped Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 100.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 45.0 ATP Concentration <snipped more My powers of written communication have never been a strong point. "Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration" is one example of a string copied into an excel cell A1. From it I want to extract into three other cells the DB number, the kinase name, which in this case is a three character string and the isoform and the concentration of the ATP 286338 (=MID(A1,30,6), works pretty well) XXX(h) 12.0 (=MID(A1,FIND(" at ",A1)+3,6), works but it is not very tidy) The kinase name can vary in length, there are hundreds of them. The ATP concentration can vary, we use 10 diffferent concentrations. Cannot find anything to work for the variable and unkown nature of the Kinase name, XXX(h) or XXXXXX(h)... Cheers Andrew j |
#4
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Depending on how accurate your examples were, these might or might not work
for you. These are longer then what you tried because I added the possibility that the data might have varying lengths. With data starting in A1 and going down, Enter this in B1 for the daughterboard number: =MID(A4,FIND("board",A4)+6,6) Enter this in C1 for the name of the kinase enzyme: =MID(SUBSTITUTE(A4,"for","^^",2),FIND("^^",SUBSTIT UTE(A4,"for","^^",2))+3, FIND("(h)",A4)-FIND("^^",SUBSTITUTE(A4,"for","^^",2))) Enter this in D1 for the ATP concentration: =MID(A4,FIND(" at ",A4)+3,FIND("ATP",A4)-FIND(" at ",A4)) Then select B1 to D1, and copy down as needed. -- HTH, RD ================================================== === Please keep all correspondence within the Group, so all may benefit! ================================================== === "Hankjam" ........ wrote in message ... On Wed, 2 Jul 2008 14:33:40 -0700 (PDT), Pete_UK wrote: I'm a bit confused. Where is the name of your enzyme? What column is your 89+ character string in? Do you want to split the 3 items out into the next 3 columns for every row of data, or just for some that match the enzyme? Does your data follow the example closely, with variation in character length/position only in the ATP concentration and what you call the kinase? Pete <snipped Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 100.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 45.0 ATP Concentration <snipped more My powers of written communication have never been a strong point. "Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration" is one example of a string copied into an excel cell A1. From it I want to extract into three other cells the DB number, the kinase name, which in this case is a three character string and the isoform and the concentration of the ATP 286338 (=MID(A1,30,6), works pretty well) XXX(h) 12.0 (=MID(A1,FIND(" at ",A1)+3,6), works but it is not very tidy) The kinase name can vary in length, there are hundreds of them. The ATP concentration can vary, we use 10 diffferent concentrations. Cannot find anything to work for the variable and unkown nature of the Kinase name, XXX(h) or XXXXXX(h)... Cheers Andrew j |
#5
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
On Wed, 2 Jul 2008 16:00:35 -0700, "RagDyeR"
wrote: Depending on how accurate your examples were, these might or might not work for you. These are longer then what you tried because I added the possibility that the data might have varying lengths. With data starting in A1 and going down, Enter this in B1 for the daughterboard number: =MID(A4,FIND("board",A4)+6,6) Enter this in C1 for the name of the kinase enzyme: =MID(SUBSTITUTE(A4,"for","^^",2),FIND("^^",SUBSTI TUTE(A4,"for","^^",2))+3, FIND("(h)",A4)-FIND("^^",SUBSTITUTE(A4,"for","^^",2))) Enter this in D1 for the ATP concentration: =MID(A4,FIND(" at ",A4)+3,FIND("ATP",A4)-FIND(" at ",A4)) Then select B1 to D1, and copy down as needed. Hi Ragdyer It surely does. Many thanks Yours Andrew j |
#6
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Thanks for the feed-back.
-- Regards, RD --------------------------------------------------------------------------- Please keep all correspondence within the NewsGroup, so all may benefit ! --------------------------------------------------------------------------- "Hankjam" ........ wrote in message ... On Wed, 2 Jul 2008 16:00:35 -0700, "RagDyeR" wrote: Depending on how accurate your examples were, these might or might not work for you. These are longer then what you tried because I added the possibility that the data might have varying lengths. With data starting in A1 and going down, Enter this in B1 for the daughterboard number: =MID(A4,FIND("board",A4)+6,6) Enter this in C1 for the name of the kinase enzyme: =MID(SUBSTITUTE(A4,"for","^^",2),FIND("^^",SUBST ITUTE(A4,"for","^^",2))+3, FIND("(h)",A4)-FIND("^^",SUBSTITUTE(A4,"for","^^",2))) Enter this in D1 for the ATP concentration: =MID(A4,FIND(" at ",A4)+3,FIND("ATP",A4)-FIND(" at ",A4)) Then select B1 to D1, and copy down as needed. Hi Ragdyer It surely does. Many thanks Yours Andrew j |
#7
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Okay, I put your example data in A1:A3 and the following formulae in
the cells stated: B1: =MID(A1,30,SEARCH(" ",A1,30)-30) C1: =MID(A1,SEARCH(") for",A1)+6,SEARCH(" at ",A1)-SEARCH(") for",A1)-6) D1: =MID(A1,SEARCH(" at ",A1)+4,FIND(" ATP",A1)-SEARCH(" at ",A1)-4) and this gave the answers you wanted. Better to copy the formulae from here into your sheet, but if you type them in take care with the spaces and be wary of spurious line-breaks that sometimes occur on the newsgroups. Copy B1:D1 down for as many items as you have in column A. Hope this helps. Pete On Jul 2, 10:25*pm, Hankjam ........ wrote: Hi I've tried variations on MID, FIND, LEN, LEFT and Right and still not able to do this. I have strings that can be 89, 90, 91 or 92 characters long and sometimes more. I want to find the name of an enzyme in this string that can vary to between 5 and 12 characters. Something like this: Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 100.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 45.0 ATP Concentration I want to list in separate cells: 1) the number of the Daughterboard (286338) 2) the name of the kinase (XXX(h)) 3) The ATP concentration and I don't want to column pharse, if that is the right term. Many thanks for your time Andrew j |
#8
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
One more--with the data in A1:
=MID(A1,30,6) =MID(A1,57,SEARCH(" at ",A1)-57) =MID(A1,SEARCH(" at ",A1)+4,SEARCH(" atp concentration",A1)-SEARCH(" at ",A1)-4) If you want the concentration to be a real number (not text): =--MID(A1,SEARCH(" at ",A1)+4, SEARCH(" atp concentration",A1)-SEARCH(" at ",A1)-4) (Format that cell to show the number of decimals you want.) Hankjam wrote: Hi I've tried variations on MID, FIND, LEN, LEFT and Right and still not able to do this. I have strings that can be 89, 90, 91 or 92 characters long and sometimes more. I want to find the name of an enzyme in this string that can vary to between 5 and 12 characters. Something like this: Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 100.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 45.0 ATP Concentration I want to list in separate cells: 1) the number of the Daughterboard (286338) 2) the name of the kinase (XXX(h)) 3) The ATP concentration and I don't want to column pharse, if that is the right term. Many thanks for your time Andrew j -- Dave Peterson |
#9
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
On Wed, 02 Jul 2008 22:25:20 +0100, Hankjam ........
wrote: Hi I've tried variations on MID, FIND, LEN, LEFT and Right and still not able to do this. I have strings that can be 89, 90, 91 or 92 characters long and sometimes more. I want to find the name of an enzyme in this string that can vary to between 5 and 12 characters. Something like this: Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXX(h) at 12.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 100.0 ATP Concentration Plate Plan for Daughterboard 286338 (05_jul08_0036) for XXXX(h) at 45.0 ATP Concentration I want to list in separate cells: 1) the number of the Daughterboard (286338) 2) the name of the kinase (XXX(h)) 3) The ATP concentration and I don't want to column pharse, if that is the right term. Many thanks for your time Andrew j Depending on how representative your data above is, the following VBA Macro might do what you want. To enter the macro, <alt-F11 opens the VB Editor. Ensure your project is highlighted in the project explorer window, then Insert/Module and paste the code below into the window that opens. To use this, select some range that has the data to be parsed. Then <alt-F8 opens the macro dialog box. Select the ParseEnzymeString macro and <RUN. The data you requested will be entered into the three adjacent columns. If this is not what you want, you will need to be more specific as to your data formats. ========================================= Option Explicit Sub ParseEnzymeString() Dim c As Range Dim re As Object, mc As Object, m As Object Dim i As Long Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Pattern = "^.*?(\d+).*?for\s+(\S+).*?(\d\S+)" For Each c In Selection Range(c.Offset(0, 1), c.Offset(0, 3)).ClearContents If re.test(c.Value) Then Set mc = re.Execute(c.Value) For i = 1 To 3 c.Offset(0, i).Value = mc(0).submatches(i - 1) Next i End If Next c End Sub ============================== --ron |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Finding cells of different string length. | Excel Discussion (Misc queries) | |||
String length | Excel Discussion (Misc queries) | |||
Limit to string length in search@replace? | Excel Discussion (Misc queries) | |||
SMALL() on unknown length of list | Excel Worksheet Functions | |||
fixed string length,even other record is copied | Excel Discussion (Misc queries) |