How do I remove a substring from the end of a string?

bestcode 2022. 9. 8. 22:06

I have the following code:

url = 'abcdc.com'
print(url.strip('.com'))

I expected: abcdc

I got: abcd

Now I do:

url.rsplit('.com', 1)

Is there a better way?

strip doesn't mean "remove this substring". x.strip(y) treats y as a set of characters and strips any characters in that set from both ends of x.
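A small sketch of this behavior, using the URL from the question (the variable names are only illustrative):

```python
# str.strip treats its argument as a set of characters, not as a substring.
url = "abcdc.com"

# Strips any of '.', 'c', 'o', 'm' from both ends until another character is hit.
stripped = url.strip(".com")

# The order of the characters does not matter: same set, same result.
also_stripped = url.strip("moc.")

print(stripped)       # abcd
print(also_stripped)  # abcd
```

The trailing "c" of "abcdc" is removed as well, because "c" is one of the characters in the set.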

On Python 3.9 and newer, you can use the removeprefix and removesuffix methods to remove an entire substring from either side of the string:

url = 'abcdc.com'
url.removesuffix('.com')    # Returns 'abcdc'
url.removeprefix('abcdc.')  # Returns 'com'

The relevant Python Enhancement Proposal is PEP-616.

On Python 3.8 and older, you can use endswith and slicing:

url = 'abcdc.com'
if url.endswith('.com'):
    url = url[:-4]

Or a regular expression:

import re
url = 'abcdc.com'
url = re.sub(r'\.com$', '', url)

If you are sure that the substring appears only at the end, the simplest way is to use replace:

url = 'abcdc.com'
print(url.replace('.com',''))
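A quick check of that caveat; the helper name remove_with_replace is made up for this illustration. Because replace removes every occurrence, it only behaves like suffix removal when the substring appears nowhere else:

```python
def remove_with_replace(text, suffix):
    # str.replace removes every occurrence of suffix, not only the trailing one.
    return text.replace(suffix, "")

print(remove_with_replace("abcdc.com", ".com"))        # abcdc
print(remove_with_replace("www.company.com", ".com"))  # wwwpany
```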

def strip_end(text, suffix):
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

Since it seems nobody has pointed this out yet:

url = "www.example.com"
new_url = url[:url.rfind(".")]

This should be more efficient than the methods using split(), since no new list object is created, and this solution works for strings with several dots.
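One edge case worth keeping in mind: rfind returns -1 when there is no dot, so the slice silently drops the last character. A minimal sketch of a guard, with a hypothetical helper name:

```python
def strip_after_last_dot(text):
    # rfind returns -1 when "." is absent; text[:-1] would then drop a character.
    idx = text.rfind(".")
    return text if idx == -1 else text[:idx]

unguarded = "localhost"[:"localhost".rfind(".")]

print(unguarded)                                # localhos
print(strip_after_last_dot("localhost"))        # localhost
print(strip_after_last_dot("www.example.com"))  # www.example
```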

Starting in Python 3.9, you can use removesuffix instead:

'abcdc.com'.removesuffix('.com')
# 'abcdc'

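It depends on what you know about your URL and exactly what you are trying to do. If you know that it will always end in '.com' (or '.net' or '.org'), then

url = url[:-4]

is the quickest solution. For more general URLs, you are probably better off looking into the urlparse library that comes with Python (urllib.parse in Python 3).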
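As a rough sketch of that suggestion, assuming Python 3 (where the library lives in urllib.parse) and a URL that includes a scheme; hostname_without_suffix is a hypothetical helper, not part of the library:

```python
from urllib.parse import urlsplit

def hostname_without_suffix(url, suffix=".com"):
    # urlsplit only fills in the hostname when the URL has a scheme ("http://...").
    host = urlsplit(url).hostname or ""
    if suffix and host.endswith(suffix):
        return host[:-len(suffix)]
    return host

print(hostname_without_suffix("http://www.stackoverflow.com/questions?x=1"))
# www.stackoverflow
```

Parsing first keeps the path and query string out of the way, so a '.com' that appears later in the URL is never touched.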

If, on the other hand, you simply want to remove everything after the final '.' in a string, then

url.rsplit('.', 1)[0]

will work. Or, if you want everything up to the first '.', try

url.split('.',1)[0]

If you know it's an extension, then

url = 'abcdc.com'
...
url.rsplit('.', 1)[0]  # split at '.', starting from the right, maximum 1 split

This works equally well with abcdc.com, www.abcdc.com, or abcdc.[anything], and is more extensible.

On Python 3.9+:

text.removesuffix(suffix)

On any Python version:

def remove_suffix(text, suffix):
    return text[:-len(suffix)] if text.endswith(suffix) and len(suffix) != 0 else text

Or as a one-liner:

remove_suffix = lambda text, suffix: text[:-len(suffix)] if text.endswith(suffix) and len(suffix) != 0 else text

How about url[:-4]?

For URLs (since that seems to be part of the topic, given the example), you can do something like this:

import os
url = 'http://www.stackoverflow.com'
name, ext = os.path.splitext(url)
print(name, ext)

# Or:
ext = '.' + url.split('.')[-1]
name = url[:-len(ext)]
print(name, ext)

Both give name = 'http://www.stackoverflow' and ext = '.com'.

This can also be combined with str.endswith(suffix) if you need to split off just ".com", or anything specific.

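DISCLAIMER: This method has a critical flaw: the partition is not anchored to the end of the URL, so it may return spurious results. For example, the result for the URL "www.comcast.net" is "www" (incorrect) instead of the expected "www.comcast.net". This solution is therefore evil. Do not use it unless you know what you are doing!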

url.rpartition('.com')[0]

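This is fairly easy to type. Be aware, though, that when '.com' does not occur in url, rpartition returns ('', '', url), so [0] yields an empty string rather than the original string.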
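To see what rpartition actually returns in each case, here is a short illustration, followed by one possible way to anchor the match to the end. The helper remove_suffix_rpartition is an assumption for this sketch, not part of the answer above:

```python
print("abcdc.com".rpartition(".com"))        # ('abcdc', '.com', '')
print("www.comcast.net".rpartition(".com"))  # ('www', '.com', 'cast.net')
print("abcdc.org".rpartition(".com"))        # ('', '', 'abcdc.org')

def remove_suffix_rpartition(text, suffix):
    # rpartition raises ValueError on an empty separator, so handle that first.
    if not suffix:
        return text
    head, sep, tail = text.rpartition(suffix)
    # Accept the split only if the separator was found at the very end.
    return head if sep and not tail else text

print(remove_suffix_rpartition("www.comcast.net", ".com"))  # www.comcast.net
```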

If you want to strip the domain no matter what it is (.com, .net, etc.), I recommend finding the . and removing everything from that point on:

url = 'abcdc.com'
dot_index = url.rfind('.')
url = url[:dot_index]

Here I'm using rfind to handle URLs like abcdc.com.net, which should be reduced to the name abcdc.com.

If you are also concerned about www.s, you should check for them explicitly:

if url.startswith("www."):
   url = url.replace("www.","", 1)

The 1 in replace is for strange edge cases like www.net.www.com.

If your URLs get any wilder than that, look at the regex answers people have responded with.

If you only want to strip the extension:

'.'.join('abcdc.com'.split('.')[:-1])
# 'abcdc'

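It works with any extension, including when other dots exist in the filename. It simply splits the string into a list on the dots and joins it back without the last element.

Here are my best solutions for when you need to strip some end of a string if it is present, and otherwise do nothing. You will probably want to use one of the first two implementations, but I have included the third for completeness.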

For a constant suffix:

def remove_suffix(v, s):
    # Guard against an empty suffix: v[:-0] would return an empty string.
    return v[:-len(s)] if s and v.endswith(s) else v
remove_suffix("abc.com", ".com") == 'abc'
remove_suffix("abc", ".com") == 'abc'

For a regex:

import re

def remove_suffix_compile(suffix_pattern):
    r = re.compile(f"(.*?)({suffix_pattern})?$")
    return lambda v: r.match(v)[1]
remove_domain = remove_suffix_compile(r"\.[a-zA-Z0-9]{3,}")
remove_domain("abc.com") == "abc"
remove_domain("sub.abc.net") == "sub.abc"
remove_domain("abc.") == "abc."
remove_domain("abc") == "abc"

For a collection of constant suffixes, the asymptotically fastest way for a large number of calls:

def remove_suffix_preprocess(*suffixes):
    suffixes = set(suffixes)
    try:
        suffixes.remove('')
    except KeyError:
        pass

    def helper(suffixes, pos):
        if len(suffixes) == 1:
            suf = suffixes[0]
            l = -len(suf)
            ls = slice(0, l)
            return lambda v: v[ls] if v.endswith(suf) else v
        si = iter(suffixes)
        ml = len(next(si))
        exact = False
        for suf in si:
            l = len(suf)
            if -l == pos:
                exact = True
            else:
                ml = min(len(suf), ml)
        ml = -ml
        suffix_dict = {}
        for suf in suffixes:
            sub = suf[ml:pos]
            if sub in suffix_dict:
                suffix_dict[sub].append(suf)
            else:
                suffix_dict[sub] = [suf]
        if exact:
            del suffix_dict['']
            for key in suffix_dict:
                suffix_dict[key] = helper([s[:pos] for s in suffix_dict[key]], None)
            return lambda v: suffix_dict.get(v[ml:pos], lambda v: v)(v[:pos])
        else:
            for key in suffix_dict:
                suffix_dict[key] = helper(suffix_dict[key], ml)
            return lambda v: suffix_dict.get(v[ml:pos], lambda v: v)(v)
    return helper(tuple(suffixes), None)
domain_remove = remove_suffix_preprocess(".com", ".net", ".edu", ".uk", '.tv', '.co.uk', '.org.uk')

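The final one is probably significantly faster in PyPy than in CPython. The regex variant is likely faster than this in virtually all cases that do not involve huge dictionaries of potential suffixes that cannot easily be represented as a regex, at least in CPython.

In PyPy, the regex variant is almost certainly slower for a large number of calls or long strings, even if the re module uses a DFA-compiling regex engine, since the vast majority of the overhead of the lambdas will be optimized out by the JIT.

In CPython, however, the fact that you are running C code for the regex comparison almost certainly outweighs the algorithmic advantages of the suffix-collection version in almost all cases.

Edit: https://m.xkcd.com/859/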

Because this is a very popular question, I am adding another solution that is now available. Python 3.9 (https://docs.python.org/3.9/whatsnew/3.9.html) added the function removesuffix() (and removeprefix()), and it does exactly what was asked here.

url = 'abcdc.com'
print(url.removesuffix('.com'))

Output:

abcdc

PEP 616 (https://www.python.org/dev/peps/pep-0616/) shows how it behaves (this is not the actual implementation):

def removeprefix(self: str, prefix: str, /) -> str:
    if self.startswith(prefix):
        return self[len(prefix):]
    else:
        return self[:]

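Benefits over self-implemented solutions:

Less fragile: The code does not depend on the user counting the length of a literal.
More performant: The code does not require a call to the Python built-in len function, nor to the more expensive str.replace() method.
More descriptive: The methods give a higher-level API for code readability, as opposed to the traditional method of string slicing.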
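For code that has to run on both sides of the 3.9 boundary, one possible approach is a small wrapper that prefers the built-in method and falls back to slicing. This is only a sketch; the name remove_suffix_compat is hypothetical:

```python
def remove_suffix_compat(text, suffix):
    # Prefer the built-in method where it exists (Python 3.9+).
    if hasattr(str, "removesuffix"):
        return text.removesuffix(suffix)
    # Fallback for older versions; the `suffix and` check avoids text[:-0].
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(remove_suffix_compat("abcdc.com", ".com"))  # abcdc
print(remove_suffix_compat("abcdc.org", ".com"))  # abcdc.org
```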

import re

def rm_suffix(url='abcdc.com', suffix=r'\.com'):
    return re.sub(suffix + '$', '', url)

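I want to repeat this answer as the most expressive way to do it. Of course, the following would take less CPU time:

def rm_dotcom(url='abcdc.com'):
    return url[:-4] if url.endswith('.com') else url

However, if CPU is the bottleneck, why write in Python?

When is CPU a bottleneck anyway? In drivers, maybe.

The advantage of using a regular expression is code reusability. What if you next want to remove '.me', which has only three characters?

The same code would do the trick:

>>> rm_suffix('abcdc.me', r'\.me')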
'abcdc'
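One caveat with building the pattern from the suffix: an unescaped '.' matches any character. A hedged sketch that treats the suffix as literal text via re.escape; the function name rm_literal_suffix is made up here, and \Z is used instead of $ so the match is anchored at the very end of the string:

```python
import re

def rm_literal_suffix(text, suffix):
    # re.escape makes every character in suffix match literally.
    return re.sub(re.escape(suffix) + r"\Z", "", text)

# With an unescaped pattern, '.' matches the 'X' as well.
unescaped = re.sub(".me$", "", "abcdXme")

print(unescaped)                                # abcd
print(rm_literal_suffix("abcdXme", ".me"))      # abcdXme
print(rm_literal_suffix("abcdc.me", ".me"))     # abcdc
```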

In my case I needed to raise an exception, so I did:

class UnableToStripEnd(Exception):
    """An exception type to indicate that the suffix cannot be removed from the text."""

    @staticmethod
    def get_exception(text, suffix):
        return UnableToStripEnd("Could not find suffix ({0}) on text: {1}."
                                .format(suffix, text))


def strip_end(text, suffix):
    """Removes the end of a string. Otherwise fails."""
    if not text.endswith(suffix):
        raise UnableToStripEnd.get_exception(text, suffix)
    return text[:len(text)-len(suffix)]

You can use split:

'abccomputer.com'.split('.com',1)[0]
# 'abccomputer'

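A broader solution that adds the ability to replace the suffix (you can remove it by replacing it with the empty string) and to set the maximum number of replacements: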

def replacesuffix(s,old,new='',limit=1):
    """
    String suffix replace; if the string ends with the suffix given by parameter `old`, such suffix is replaced with the string given by parameter `new`. The number of replacements is limited by parameter `limit`, unless `limit` is negative (meaning no limit).

    :param s: the input string
    :param old: the suffix to be replaced
    :param new: the replacement string. Default value the empty string (suffix is removed without replacement).
    :param limit: the maximum number of replacements allowed. Default value 1.
    :returns: the input string with a certain number (depending on parameter `limit`) of the rightmost occurrences of string given by parameter `old` replaced by string given by parameter `new`
    """
    if s[len(s)-len(old):] == old and limit != 0:
        return replacesuffix(s[:len(s)-len(old)],old,new,limit-1) + new
    else:
        return s

In your case, with the default arguments, the desired result is obtained with:

replacesuffix('abcdc.com', '.com')
# 'abcdc'

Some more general examples:

replacesuffix('whatever-qweqweqwe', 'qwe', 'N', 2)
# 'whatever-qweNN'

replacesuffix('whatever-qweqweqwe', 'qwe', 'N', -1)
# 'whatever-NNN'

replacesuffix('12.53000', '0', ' ', -1)
# '12.53   '

This is a perfect use for regular expressions:

>>> import re
>>> re.match(r"(.*)\.com", "hello.com").group(1)
'hello'

Here is the simplest code:

url=url.split(".")[0]

Python >= 3.9:

'abcdc.com'.removesuffix('.com')

Python < 3.9:

def remove_suffix(text, suffix):
    # Check that suffix is non-empty: text[:-0] would return an empty string.
    if suffix and text.endswith(suffix):
        text = text[:-len(suffix)]
    return text

remove_suffix('abcdc.com', '.com')

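Using replace and count

This might seem a little hacky, but it gives you a safe replace without using startswith and an if statement: with the count argument of replace, you can limit the replacement to one.

mystring = "www.comwww.com"

Prefix:

print(mystring.replace("www.","",1))

Suffix (you write the suffix reversed, so .com becomes moc.):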

print(mystring[::-1].replace("moc.","",1)[::-1])

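I did it the following way, using the built-in rstrip function. (Note that rstrip, like strip, treats its argument as a set of characters, so this works for "test.com" but not for every input.)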

string = "test.com"
suffix = ".com"
newstring = string.rstrip(suffix)
print(newstring)
test
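This happens to work for "test.com", but rstrip strips characters rather than a substring, so other inputs can lose more than the suffix. A brief illustration, with an endswith-based alternative (the helper name strip_suffix is hypothetical):

```python
suffix = ".com"

# rstrip removes any trailing run of the characters '.', 'c', 'o', 'm'.
print("test.com".rstrip(suffix))   # test
print("abcdc.com".rstrip(suffix))  # abcd

def strip_suffix(text, suffix):
    # Remove the suffix only when the string really ends with it.
    return text[:-len(suffix)] if suffix and text.endswith(suffix) else text

print(strip_suffix("abcdc.com", suffix))  # abcdc
```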

Source URL: https://stackoverflow.com/questions/1038824/how-do-i-remove-a-substring-from-the-end-of-a-string
