7/5 Study Log | Python | Loop Practice

Eddie_D 2024. 7. 5. 16:58

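Practice problem

Here are five English news articles. Let's count how many times each word appears across all of them!

I kept getting stuck, so I reached for a different function and got the answer with a bit of a hack(?)..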

from collections import Counter

news1 = "hello it, it's me. I am very happy to hear that?"
news2 = "hello Can you hear me?"
news3 = "Same old same old"
news4 = "So long time no see you too"
news5 = "have you met ted?"

new1=news1.split(" ")
new2=news2.split(" ")
new3=news3.split(" ")
new4=news4.split(" ")
new5=news5.split(" ")

allnews=new1+new2+new3+new4+new5
word_counts=Counter(allnews)

for word, count in word_counts.items():
    print(f"{word} appears {count} time(s)")
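
A small sketch of what Counter gives back, since I used it above without really looking at it. The words list here is made-up sample data standing in for allnews, not the real articles.

```python
from collections import Counter

# Made-up sample list standing in for allnews (illustrative data only).
words = ["hello", "you", "hear", "hello", "you", "you"]

word_counts = Counter(words)

# A Counter can be indexed like a dict; a word that never appeared gives 0
# instead of raising KeyError.
you_count = word_counts["you"]
missing_count = word_counts["goodbye"]
print(you_count, missing_count)

# most_common(n) returns the n most frequent (word, count) pairs.
top = word_counts.most_common(1)
print(top)
```

So Counter is basically a dictionary that does the counting loop for me.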

Trying again (T_T)!!

1. Using a dictionary

news1 = "hello it, it's me. I am very happy to hear that?"
news2 = "hello Can you hear me?"
news3 = "Same old same old"
news4 = "So long time no see you too"
news5 = "hello have you met ted?"

allnews=news1.split(" ")+news2.split(" ")+news3.split(" ")+news4.split(" ")+news5.split(" ")
voca={}
for i in allnews:
    voca[i] = 0

for i in allnews:
    voca[i] = voca[i] +1

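1. Split everything into individual words.

2. Create an empty dictionary called voca and add each word to it as a key.
It ends up looking like voca = { 'hello' : 0, 'it,' : 0 ... }.

3. Each time a word comes up, add 1 to its value in the dictionary. ***
This part didn't click for me at first: as i walks through the values in allnews, every time 'hello' shows up its value goes 0 + 1, then 1 + 1, and so on. Now that I get it, it's so simple that I can't understand why I didn't understand it.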
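
The three steps above can be traced on a tiny list. This is just a sketch with a made-up sample list (sample and after_first_pass are names I added for illustration), so the state after each loop is easy to see.

```python
# Made-up three-word list instead of the full allnews (illustrative only).
sample = ["hello", "you", "hello"]

# Step 2: every word becomes a key with value 0.
voca = {}
for word in sample:
    voca[word] = 0
after_first_pass = dict(voca)  # copy so we can look at this state later
print(after_first_pass)

# Step 3: walk the list again and add 1 each time a word comes up.
for word in sample:
    voca[word] = voca[word] + 1
print(voca)
```

The first loop is only there so that voca[word] already exists when the second loop reads it; without it, voca[word] + 1 would raise a KeyError on the first word.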

2. Using the dictionary method get

a={1:'one', 2:'two'}
b=a.get(1, 0) # If the key 1 exists, fetch its value; if the key doesn't exist, return 0.
print(b) # prints one

voca = {}

for i in allnews:
    voca[i] = voca.get(i, 0) + 1 

voca

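In other words,

voca[i] is the entry in voca for the current key, say 'hello', and the = sets what value that key holds. voca.get(i, 0) + 1 means: if 'hello' is not in voca yet, get returns 0 and 1 is added, so the value becomes 1. If 'hello' is already there, get returns its current count and 1 is added to that.
The words in allnews go in as keys one by one, and whenever a word repeats its value goes up by 1, so that's how the counting works!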
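
To convince myself, here is a sketch that does the get step by hand twice for the same word. The names first_lookup and second_lookup are just mine for illustration.

```python
voca = {}

# First time 'hello' appears: the key is missing, so get falls back to 0.
first_lookup = voca.get("hello", 0)
voca["hello"] = first_lookup + 1

# Second time: the key exists, so get returns the stored count (1).
second_lookup = voca.get("hello", 0)
voca["hello"] = second_lookup + 1

print(first_lookup, second_lookup, voca)
```

This is why the get version needs only one loop: the default value 0 does the job of the separate "set everything to 0" loop in method 1.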

3. Using a list

news1 = "hello it, it's me. I am very happy to hear that?"
news2 = "hello Can you hear me?"
news3 = "Same old same old"
news4 = "So long time no see you too"
news5 = "hello have you met ted?"

allnews=news1.split(" ")+news2.split(" ")+news3.split(" ")+news4.split(" ")+news5.split(" ")

allnews.count(allnews[0])

countlist= []
for i in range(len(allnews)):
    countlist.append(allnews.count(allnews[i]))
    print(f"{allnews[i]} : {allnews.count(allnews[i])}")

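My version can't remove duplicates (a word that appears three times gets printed three times), so I figured I'd have to go back to collections and Counter like in the first attempt.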
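
Looking back, the duplicate lines could probably also be avoided while still using list.count. A minimal sketch, assuming an extra list (I'm calling it seen, which is not in my original code) that remembers which words were already reported, and a made-up sample list:

```python
# Made-up sample list standing in for allnews (illustrative only).
sample = ["hello", "you", "hello", "old", "old"]

seen = []   # words already reported (hypothetical helper list)
lines = []  # one output line per distinct word

for word in sample:
    if word in seen:
        continue  # already reported, skip the repeat
    seen.append(word)
    lines.append(f"{word} : {sample.count(word)}")

for line in lines:
    print(line)
```

It still calls count once per distinct word, which rescans the whole list each time, so it is fine for five short articles but would get slow on a large text.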

I also looked at the solution provided in the course. Even with the solution in front of me, it took quite a while to work through it slowly (T_T)

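total_news = allnews  # assumption: total_news in the course solution is the combined word list built above

word_list = []   # distinct words, in order of first appearance
count_list = []  # count_list[k] is the count for word_list[k]

for word in total_news:
    if word not in word_list: # First time this word appears.
        word_list.append(word)
        count_list.append(1)
    else: # The word has appeared before.
        word_idx = word_list.index(word)
        count_list[word_idx] = count_list[word_idx] + 1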

# print
for word, count in zip(word_list, count_list):
    print(word, count)
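
As a sanity check, the two-list approach should give the same counts as Counter. A rough sketch, where count_with_lists is a hypothetical wrapper I wrote around the loop above and the sample sentence is made up:

```python
from collections import Counter


def count_with_lists(words):
    """Hypothetical wrapper around the two-list counting loop."""
    word_list = []
    count_list = []
    for word in words:
        if word not in word_list:
            word_list.append(word)
            count_list.append(1)
        else:
            word_idx = word_list.index(word)
            count_list[word_idx] += 1
    return word_list, count_list


# Made-up sample; split(" ") as in the rest of this post.
sample = "Same old same old hello you hello".split(" ")

word_list, count_list = count_with_lists(sample)
as_dict = dict(zip(word_list, count_list))
matches = as_dict == dict(Counter(sample))

print(as_dict)
print(matches)
```

Note that 'Same' and 'same' are counted as different words in both versions, because string comparison is case-sensitive.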