Visual LISP, AutoLISP and General Customization
String Manipulation wildcards

13 REPLIES 13
Message 1 of 14
jcourtne
2246 Views, 13 Replies

I'm trying to remove some text from a comma-delimited string. Does anyone have some simple code to do this?

I could normally do this quickly, but the wildcard character is causing bugs in the subfunction I wrote.

 Trying to get rid of:

                                                      *SP002, *SP003, *SP004

from a string of wire names such as:

                                             GTR1,  *SP002, *SP003, *SP004, GTR5

 

Note the wildcard '*' is actually in the string.

Thanks ahead.

 

Message 2 of 14
SomeBuddy
in reply to: jcourtne

Hi,

 

You can try the following function. It takes two arguments: the string to be cleaned and the pattern to search for and eliminate.

In this case, because the pattern contains a wildcard character (the asterisk) that has to be escaped (read literally), you have to put a reverse quote in front of it, and your pattern is:

 

"*`*SP*"

 

Usage:

Command: (cleanstr "GTR1, *SP002, *SP003, *SP004, GTR5" "*`*SP*")

"GTR1,GTR5"

 

;; The second argument was misspelled "patern" when first posted (see Message 11).
(defun cleanstr (string pattern / str2lst)
  (defun str2lst (str sep / pos)
    (if (setq pos (vl-string-search sep str))
      (cons (substr str 1 pos)
        (str2lst (substr str (+ (strlen sep) pos 1)) sep)
      )
      (list str)
    )
  )
  (apply
    'strcat
    (vl-remove-if
     '(lambda (x)(wcmatch x pattern))
      (mapcar
       '(lambda (y)(vl-string-subst "," " " y))
        (str2lst string ",")
      )
    )
  )
)

 

 

 

Message 3 of 14
Kent1Cooper
in reply to: jcourtne


@jcourtne wrote:

....

 Trying to get rid of:

                                                      *SP002, *SP003, *SP004

from a string of wire names such as:

                                             GTR1,  *SP002, *SP003, *SP004, GTR5

 

Note the wildcard '*' is actually in the string.

....


 

Look at the reverse-quote ` character in (wcmatch), which is to "escape special characters," or to read the next character literally.

 

(wcmatch "YourString"  "`**")

 

returns T for strings that start with a *, so you can [for example] subdivide the longer string into a list of shorter ones with a specified delimiter [see recent thread], and use that test to eliminate the strings you don't want.

Kent Cooper, AIA
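
A few checks may make the escape rule above easier to see. This is only an illustrative sketch using the wire names from the question; the pattern "`*SP###" (a literal asterisk, then SP, then three digits) is one assumption about what the unwanted names look like.

```lisp
;; Illustrative checks only; the sample strings come from the question.
(wcmatch "*SP002" "`*SP###")   ; T   - escaped *, then SP and three digits
(wcmatch "GTR1"   "`*SP###")   ; nil - does not start with a literal *
(wcmatch "XSP002" "`*SP###")   ; nil - X is not a literal *
(wcmatch "XSP002" "*SP###")    ; T   - an unescaped * matches any run of characters
```

The last line shows why the escape matters: without the reverse quote, the asterisk also accepts names that merely end in SP and three digits.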
Message 4 of 14
jcourtne
in reply to: SomeBuddy

I have a hard time understanding the lambda functions but it looks more efficient than my code.

Is it possible to generalize the code to work with non comma delimited text and still use wildcards?

 

I'm trying to use this as a subfunction for three other functions.

I would use vl-string-subst except it doesn't seem to work with wildcards.

Thanks.

Message 5 of 14
SomeBuddy
in reply to: jcourtne

Can you be more specific about "to work with non comma delimited text and still use wildcards"? If you look at the code, it transforms the string into a list of strings, then takes each string item in that list, tests it against the pattern, and "stitches" the remaining string items back into one big string.

So when you say "to work with non comma delimited text and still use wildcards", it looks too vague to me! Like searching in a page of text, or what? That would be a little more elaborate, but still doable with TXT files (not with DOC files, for example, which would be far more complicated).

 

As for the lambda function, it's like defining a function on the spot, but it's called an anonymous function, since it has no name. It's a generic function that works like any other defined function, accepting arguments and also local variables. It also makes the programmer's intention more apparent by laying out the function at the spot where it is to be used. It returns the value of its last expression, and is often used in conjunction with apply and/or mapcar to perform a function on a list.
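
As a rough illustration of the description above, the following sketch applies anonymous functions to a small list. The variable `wires` and the named function `bracket` are hypothetical and exist only for this example.

```lisp
;; Hypothetical sample data for illustration.
(setq wires '("GTR1" "*SP002" "GTR5"))

;; mapcar calls the anonymous function once per item and collects the results.
(mapcar '(lambda (w) (strcat "[" w "]")) wires)
;; returns ("[GTR1]" "[*SP002]" "[GTR5]")

;; The same thing with a named function, for comparison.
(defun bracket (w) (strcat "[" w "]"))
(mapcar 'bracket wires)

;; vl-remove-if drops every item for which the anonymous test returns non-nil.
(vl-remove-if '(lambda (w) (wcmatch w "`*SP*")) wires)
;; returns ("GTR1" "GTR5")
```

The lambda form saves defining `bracket` when the function is only needed in one place.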

 

Message 6 of 14
jcourtne
in reply to: jcourtne

I will certainly elaborate.

I want to change 3NEZSC4075 to 3NEZSB4075.

Swapping the C to a B.

Or like previously stated swap *SP### to ""

Or BH-EM-5004 to BH-MA-5004

Swapping the EM to MA

Or using wildcards: swapping -@@- to -MA-

That way I don't have to specify the exact text when I click on something with a pattern like BH-EM-5004.

Forget the comma-delimited part. If I can replace all *SP### with blanks, then all I have to do is run the function again while passing the arguments ", ," "," to eliminate the extra commas.

 

I'm looking to make this a subfunction so leave the parameters you have.

I will work on what you gave me to see if I can manipulate it into what I'm looking for.

However, I'll take any and all help you or others would give.

Thanks.
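
For the cases above that need no wildcards (C to B, EM to MA), a literal replace-all can be sketched on top of vl-string-subst, which on its own replaces only the first occurrence. The function name `replace-all` is hypothetical, the argument order copies vl-string-subst, and the sketch assumes `old` is not an empty string.

```lisp
;; Replace every occurrence of the literal string OLD in STR with NEW.
;; Assumes OLD is a non-empty string; no wildcards are interpreted.
(defun replace-all (new old str / pos)
  (setq pos 0)
  ;; vl-string-search returns a zero-based position, or nil when not found.
  (while (setq pos (vl-string-search old str pos))
    (setq str (vl-string-subst new old str pos) ; replace the hit at POS
          pos (+ pos (strlen new))))            ; resume after the new text
  str)

(replace-all "MA" "EM" "BH-EM-5004, BH-EM-5005")
;; returns "BH-MA-5004, BH-MA-5005"
```

Resuming the search after the inserted text keeps the loop from rescanning its own output, so it still ends when `new` contains `old`.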

Message 7 of 14
SomeBuddy
in reply to: jcourtne

The most important aspect here is how your text is structured. All these 3NEZSC4075, *SP###, BH-EM-5004 and any other patterns: where are they? Are they floating around like this, are they in a massive block of text, in a TXT file, are they selected on screen, listed somewhere, are they separated by something else?

If you don't want the commas, it's easy to remove them without passing through the function one more time, but I have to understand the final output of your text as well as the structure of the input, because when I go "there" to read it, I have to "have the proper tools" for that specific situation.

Maybe you can provide some concrete examples: a file to be read and what you would like the final output to be.

Message 8 of 14
jcourtne
in reply to: SomeBuddy

First

I am planning on putting a button on a tool palette that runs a lisp.

The lisp will prompt the user for two strings. What they want to change, and what they want to change it to.

After that they click on text or attribute and it gets the TextString property runs the function we're talking about then pops the return value back in and allows them to continue clicking.  This function should allow them to enter wildcards.  I did have that function working then I tried to make a sub function out of the guts.

Second

I have a function that allows the user to click on a block. The function then reads the ID attribute (subfunction) and goes to a specified text file that a database has written. The function searches a particular column (subfunction) and returns the whole line. Then another function parses the tab-delimited strings (subfunction) and returns string values for each attribute in the block. *whew*

So with this text that has been returned for one attribute it is formatted from the database per my first post:

*SP###, *SP###, *SP###

and instead of breaking the string up I was creating this subfunction to run on the line and get rid of all *SP###'s and , ,'s

 

In conclusion, I'm working on a function (yStringSwap stringarg patternOUT patternIN)

Similar to vl-string-subst, only it works with wildcards and replaces all occurrences.

Or I was thinking that it might be possible to add a boolean parameter to let the parent function decide whether the function should return a string with all occurrences replaced or just the first.

 

Thanks for reading this whole thing. I probably would not have...

Message 9 of 14
SomeBuddy
in reply to: jcourtne

I'm sorry, no offense, but I have a hard time trying to understand what you want to do. Let's go one step at a time, so let's talk about the first part and work a little bit more on breaking it into small steps.

So I understand that you ask for two strings: one would be the existing string and the second the new string to be. But here I lose it. You click on an attribute, get its TextString property, and replace the TextString value with the new one? But then why are you asking for an existing string in the beginning, if you are going to search attributes for a target?

Do you mean that in the beginning you ask for a specific pattern to search for and for a replacement pattern, and then you keep selecting attributes and text in search of that pattern to be replaced? If you can confirm this, it's OK, but then what do you mean by "pops the return value back in", and what is the returned value at that moment?

Let's finish this and then we'll see about the second part.

Message 10 of 14
SomeBuddy
in reply to: jcourtne

I'm afraid that I can't do this and I don't even know if this is doable.

 

You see, in that function it's easier, because it operates with strings and patterns of the same length, but replacing a wildcard pattern inside a bigger string becomes much too difficult, because of the complex analysis that has to be performed.

One can certainly identify a pattern using (wcmatch ...), but then one has no information about what the matching part of the string contains or where in the string it is placed. It seems to me that (wcmatch ...) returns a very vague confirmation, like "yep, there is something in that string matching your pattern convention, but I won't tell you what it is or where it is".

 

In an ideal world, (wcmatch ...) would return something like a dotted pair, where the first element is the position where the match starts and the second element is the actual matching chunk of the string. So, instead of

 

(wcmatch "Serial number is Q123XLS-5234-QZ" "*Q###???*")

 

returning T, it would return something like

 

(17 . "Q123XLS")

 

Imagine how easy it would have been to achieve what you want to do!!! But unfortunately this is not the case 😞 So if you want to start analyzing it, most probably in theory it could be done, but in practice it would become a huge mess.

 

Consider that there are 10 numerical digits and 26 alphabetical characters in lower case, multiply the letters by two for the upper case, then add some non-alphanumeric characters and then the wildcards, which have to be treated in a specific way. Then try to imagine in how many ways they could be combined to match a specific pattern, and how you would analyze all this so that you can identify what matched your pattern and where it is placed! And obviously, the longer the pattern, the more complex the analysis becomes!

 

I don't feel like I am able to solve this and I wonder if somebody else would. And I think that this is the reason why the Find command in AutoCAD is not able to use wildcards, because it would have been far too complicated to build it.

 

Hope I made myself clear
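
For what it's worth, the dotted-pair result described above can be approximated by brute force: test substrings with (wcmatch) until one matches. The sketch below is only that, a sketch. The name `find-wc` is hypothetical, the pattern is given without the surrounding asterisks, and the number of (wcmatch) calls grows roughly with the square of the string length, so it is assumed to run on short strings only.

```lisp
;; Return (zero-based-position . "matched text") for the leftmost
;; substring of STR that matches PAT, or nil when nothing matches.
;; At that position the shortest matching substring is returned.
(defun find-wc (str pat / len start n hit)
  (setq len (strlen str) start 1)
  (while (and (not hit) (<= start len))
    (setq n 1)
    ;; Try every length that still fits inside the string.
    (while (and (not hit) (<= (+ start n -1) len))
      (if (wcmatch (substr str start n) pat)
        (setq hit (cons (1- start) (substr str start n))))
      (setq n (1+ n)))
    (setq start (1+ start)))
  hit)

(find-wc "Serial number is Q123XLS-5234-QZ" "Q###???")
;; returns (17 . "Q123XLS")
```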

Message 11 of 14
Kent1Cooper
in reply to: SomeBuddy


@SomeBuddy wrote:

....

(defun cleanstr (string patern / str2lst)
....

     '(lambda (x)(wcmatch x pattern))
....


[You're somewhat beyond this in the discussion, but if SomeBuddy's routine enters back into it, one thing I found is that it will need the "patern" in the arguments list changed to "pattern" to agree with the usage later on.  Before doing that, I got an error message when I tried it.]
Kent Cooper, AIA
Message 12 of 14
jcourtne
in reply to: jcourtne

Here is some code that I have written to handle 'most' of my string problems.

It is in no way assured to be bug free or written to handle any situation but it works for me.

Test it out, tell me what you think

 

(defun yFindStrMatch (s_Source s_Pattern / i_Length i j s_subSource l_Ret)
;;; Finds the first pattern in the string and returns a list
;;; The list contains the start and length of the matched string from s_Source
;;; Returns nil if No pattern was found
;;; Will not check for sublists or invalid sub data types.
;;; No error rescuing functions only primary data type checking

    (setq i_Length (strlen s_Source))
    (setq i 1)
    (setq j 1)
    (setq s_subSource (substr s_Source i j))
   
    (while (<= j i_Length)
        (while (<= (+ i j -1) i_Length)
            (setq s_subSource (substr s_Source i j))
            (if (wcmatch s_subSource s_Pattern)
                (progn
                    (setq i_Length 0)
                    (setq l_Ret (list i j))
                )
            )
            (setq i (+ i 1))
        )
        (setq j (+ j 1))
        (setq i 1)
    )
    l_Ret
) ;_ End of function yFindStrMatch
;;;****************************************************************************

 

(defun yStringSwap ( s_Source s_RemoveThis s_PutThis /
                    l_MatchIndex i_Start i_Len s_Prefix s_Done)
;;; Replaces every part of a string that matches a pattern with another string
;;; Parameters:
;;;     s_Source        string      Generic string
;;;     s_RemoveThis    string      Pattern to remove from s_Source
;;;     s_PutThis       string      Exact characters to put in place of s_RemoveThis

    ;; s_Done collects the text that has already been processed. Only the
    ;; remainder is searched again, so the loop still ends when s_PutThis
    ;; itself matches s_RemoveThis (for example "-MA-" matches "-@@-").
    (setq s_Done "")
    ;; Attempt to find the pattern they want to get rid of
    (setq l_MatchIndex (yFindStrMatch s_Source s_RemoveThis))
    (while l_MatchIndex
        (setq i_Start (car l_MatchIndex))
        (setq i_Len (cadr l_MatchIndex))

        ;; Get everything before the pattern
        (if (> i_Start 1)
            (setq s_Prefix (substr s_Source 1 (- i_Start 1)))
            (setq s_Prefix "")
        )
        ;; Move the prefix and the replacement to the finished part
        (setq s_Done (strcat s_Done s_Prefix s_PutThis))
        ;; Keep only what follows the match
        (setq s_Source (substr s_Source (+ i_Start i_Len)))
        ;; Find the next match in the remainder
        (setq l_MatchIndex (yFindStrMatch s_Source s_RemoveThis))
    )
    (strcat s_Done s_Source)
) ;_ End of function yStringSwap
;;;****************************************************************************

(defun c:yTxtChange ( / s_PatternOut s_PatternIn e_Pick o_TextObject s_Source
                            s_Refill)
;;; Allows users to sequentially select text or attributes.
;;; In each one, every match of the wildcard pattern the user entered is
;;; replaced by the literal text the user entered.
;;; Picking empty space or pressing Enter ends the command.

    (vl-load-com)
    ;; Prompt user for pattern (T allows spaces in the answers)
    (setq s_PatternOut (getstring T "\nType the pattern you want to replace: "))
    (setq s_PatternIn (getstring T "\nType what you would like to put in its place: "))
    ;; The user will click on all the text they want the patterns to swap in.
    (while (setq e_Pick (car (nentsel "\nPick each text you want changed: ")))
        (setq o_TextObject (vlax-ename->vla-object e_Pick))
        (if (vlax-property-available-p o_TextObject "TextString")
            (progn
                ;; Get text already in object
                (setq s_Source (vlax-get-property o_TextObject "TextString"))
                (setq s_Refill (yStringSwap s_Source s_PatternOut s_PatternIn))
                (if (= s_Source s_Refill)
                    ;; Nothing changed, so tell user pattern was not found
                    (princ "\nPattern not found in text.")
                    (progn
                        ;; Populate the object with the desired full string
                        (vlax-put-property o_TextObject "TextString" s_Refill)
                        ;; Report what was changed
                        (princ (strcat "\n" s_Source " -> " s_Refill))
                    )
                )
            )
            ;; Otherwise tell user that object does not contain text.
            (princ "\nThat object does not have text.")
        )
    )
    ;; Exit quietly
    (princ)
) ;_ End of function c:yTxtChange
;;;****************************************************************************

Message 13 of 14
SomeBuddy
in reply to: Kent1Cooper

Thanks for pointing this out, the typical typo 🙂

Message 14 of 14
Kent1Cooper
in reply to: jcourtne


@jcourtne wrote:

i'm trying to get rid of some text from a comma delimited string. .... the wildcard is giving my written subfunction some bugs.

 Trying to get rid of:

                                                      *SP002, *SP003, *SP004

from a string of wire names such as:

                                             GTR1,  *SP002, *SP003, *SP004, GTR5

 

Note the wildcard '*' is actually in the string.

....


By now, the attached is not relevant to the further-defined needs on this thread, but it does work to answer the original situation, in case someone comes along who needs that.  I was wondering whether a different approach might shorten the code or the operation, compared to the approach that SomeBuddy took.  It turned out not to be shorter code, and whether it's faster would probably depend on the nature of the string to be processed.  But it was an interesting exercise, so here it is.

 

It does not use (wcmatch), so the use of a character that (wcmatch) considers a wildcard presents no difficulty [again, as in the original question, no longer appropriate to the later thread].  And it passes by any number of delimiters that are not followed immediately by the "bad" starting character(s), all at once, so it does not need to consider every delimited substring.

 

Basically, instead of pulling the string apart around the delimiters and performing some operation on every substring individually, it only does an operation at those substrings that are to be removed [plus a little starting and ending stuff], and except in the immediate vicinity of those, does not otherwise bother even to look at delimiters, much less to subdivide the string around them.

 

That could possibly make it faster, especially in a string with few substrings to be removed.  Even if that's the case, of course, the difference would probably not be noticeable to a User except in very long strings.  But I had fun figuring it out, anyway.

Kent Cooper, AIA
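
The attachment mentioned above is not reproduced here. The following is only a rough sketch of the approach as described in the post, not Kent's routine: it searches for a delimiter followed immediately by the unwanted starting characters and cuts from there to the next delimiter. The name `strip-items` is hypothetical, and the sketch assumes that `delim` and `bad` are non-empty and that items are separated by exactly `delim` (a comma and one space in the example).

```lisp
;; Remove every delimited item of STR that starts with BAD, without
;; splitting the string and without (wcmatch).
(defun strip-items (str delim bad / key pos nxt)
  ;; A leading unwanted item goes together with the delimiter after it.
  (while (equal (vl-string-search bad str) 0)
    (setq str (if (setq nxt (vl-string-search delim str))
                (substr str (+ nxt (strlen delim) 1))
                "")))
  ;; Any other unwanted item goes together with the delimiter before it.
  (setq key (strcat delim bad))
  (while (setq pos (vl-string-search key str))
    ;; Look for the delimiter that ends the unwanted item, if there is one.
    (setq nxt (vl-string-search delim str (+ pos (strlen delim))))
    (setq str (strcat (substr str 1 pos)
                      (if nxt (substr str (1+ nxt)) ""))))
  str)

(strip-items "GTR1, *SP002, *SP003, *SP004, GTR5" ", " "*SP")
;; returns "GTR1, GTR5"
```

Because the asterisk is passed to vl-string-search rather than (wcmatch), it is treated as an ordinary character and needs no escaping.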
