[Hands-on] Korean and English Search / Synonym-Based Search / Filtering

author

JSCODE 박재성

✅ Applying the requirements

Search must work well for both Korean and English.

For product names, the following values must be recognized as synonyms.


samsung, 삼성
apple, 애플
노트북, 랩탑, 컴퓨터, computer, laptop, notebook
전화기, 휴대폰, 핸드폰, 스마트폰, 휴대전화, phone, smartphone, mobile phone, cell phone
아이폰, iphone
맥북, 맥, macbook, mac

Results must be filterable by the category and price fields.

Search must be restrictable to a specific category.

Search must be restrictable to a specific price range.

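Making search work well for both Korean and English

To split Korean values in the product name, product description, and category fields into appropriate tokens, we need the Nori analyzer rather than the default standard analyzer. As covered earlier, the Nori analyzer can be configured in either of the following ways.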


// Option 1
"analyzer": "nori"

// Option 2 (spell out the nori analyzer's components explicitly)
"char_filter": [], 
"tokenizer": "nori_tokenizer", 
"filter": ["nori_part_of_speech", "nori_readingform", "lowercase"]

To customize the built-in Nori analyzer for our requirements, we have to use the second option. Let's redefine the mapping with it and recreate the index.


DELETE /products

PUT /products
{
  "settings": {
    "analysis": {
      // Custom analyzer definitions
      "analyzer": {
        "products_name_analyzer": {
          "char_filter": [],
          "tokenizer": "nori_tokenizer",
          "filter": ["nori_part_of_speech", "nori_readingform", "lowercase"]
        },
        "products_description_analyzer": {
          "char_filter": ["html_strip"],
          "tokenizer": "nori_tokenizer",
          "filter": ["nori_part_of_speech", "nori_readingform", "lowercase"]
        },
        "products_category_analyzer": {
          "char_filter": [],
          "tokenizer": "nori_tokenizer",
          "filter": ["nori_part_of_speech", "nori_readingform", "lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "name": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_name_analyzer"
      },
      "description": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_description_analyzer"
      },
      "price": {
        "type": "integer" // integer of at most 1 billion
      },
      "rating": {
        "type": "double" // real number (may have a fractional part)
      },
      "category": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_category_analyzer"
      }
    }
  }
}

Let's verify that each analyzer works with the Analyze API.


GET /products/_analyze
{
  "field": "name",
  "text": "특수가전제품 Samsung TV"
}

GET /products/_analyze
{
  "field": "description",
  "text": "<h1>특수가전제품 Samsung TV</h1>"
}

GET /products/_analyze
{
  "field": "category",
  "text": "특수가전제품 Samsung TV"
}

Making synonym search work


DELETE /products

PUT /products
{
  "settings": {
    "analysis": {
      // Token filter definition
      "filter": {
        "product_synonyms": {
          "type": "synonym",
          "synonyms": [
            "samsung, 삼성",
            "apple, 애플",
            "노트북, 랩탑, 컴퓨터, computer, laptop, notebook",
            "전화기, 휴대폰, 핸드폰, 스마트폰, 휴대전화, phone, smartphone, mobile phone, cell phone",
            "아이폰, iphone",
            "맥북, 맥, macbook, mac"
          ]
        }
      },

      // Custom analyzer definitions
      "analyzer": {
        "products_name_analyzer": {
          "char_filter": [],
          "tokenizer": "nori_tokenizer",
          "filter": [
            "nori_part_of_speech", 
            "nori_readingform", 
            "lowercase",
            "product_synonyms"
          ]
        },
        "products_description_analyzer": {
          "char_filter": ["html_strip"],
          "tokenizer": "nori_tokenizer",
          "filter": [
            "nori_part_of_speech", 
            "nori_readingform", 
            "lowercase"
          ]
        },
        "products_category_analyzer": {
          "char_filter": [],
          "tokenizer": "nori_tokenizer",
          "filter": [
            "nori_part_of_speech", 
            "nori_readingform", 
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "name": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_name_analyzer"
      },
      "description": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_description_analyzer"
      },
      "price": {
        "type": "integer" // integer of at most 1 billion
      },
      "rating": {
        "type": "double" // real number (may have a fractional part)
      },
      "category": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_category_analyzer"
      }
    }
  }
}

Let's verify that the synonym filter works with the Analyze API.


GET /products/_analyze
{
  "field": "name",
  "text": "삼성 애플 노트북 전화기 아이폰 맥북"
}
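As a rough mental model of what the synonym rules above express, the following Python sketch treats each rule as a group of interchangeable terms. It is not how Elasticsearch implements the synonym filter (the real filter works on analyzed tokens, including multi-token entries such as "mobile phone"); the helper names build_synonym_index and are_synonyms are made up for this illustration.

```python
# Simplified model of equivalent-synonym rules (not Elasticsearch itself).
SYNONYM_RULES = [
    "samsung, 삼성",
    "apple, 애플",
    "노트북, 랩탑, 컴퓨터, computer, laptop, notebook",
    "전화기, 휴대폰, 핸드폰, 스마트폰, 휴대전화, phone, smartphone, mobile phone, cell phone",
    "아이폰, iphone",
    "맥북, 맥, macbook, mac",
]


def build_synonym_index(rules):
    """Map each lowercased term to the full set of terms in its rule."""
    index = {}
    for rule in rules:
        group = frozenset(term.strip().lower() for term in rule.split(","))
        for term in group:
            index[term] = group
    return index


def are_synonyms(index, a, b):
    """Return True when both terms belong to the same rule."""
    group = index.get(a.strip().lower())
    return group is not None and b.strip().lower() in group


index = build_synonym_index(SYNONYM_RULES)
print(are_synonyms(index, "Samsung", "삼성"))   # True
print(are_synonyms(index, "laptop", "노트북"))  # True
print(are_synonyms(index, "맥북", "아이폰"))    # False
```

The lowercasing in the sketch mirrors the fact that the lowercase filter runs before product_synonyms in the analyzer's filter chain, so "Samsung" and "samsung" are treated the same.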

Making price filtering work

[Existing search query]


POST /products/_search
{
  "query": {
    "multi_match": {
      "query": "lg",
      "fields": [
        "name^3",
        "description^1",
        "category^2"
      ]
    }
  }
}

To filter the existing search query by price, modify it as follows.


POST /products/_search
{
  "query": {
    "bool": { // used when two or more conditions must be combined
      "must": { // query that contributes to the score (flexible full-text search)
        "multi_match": {
          "query": "lg",
          "fields": [
            "name^3",
            "description^1",
            "category^2"
          ]
        }
      },
      "filter": { // query that does not affect the score (exact value comparison)
        "range": {
          "price": {
            "gte": 10000,
            "lte": 50000
          }
        }
      }
    }
  }
}
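In an application, the price bounds usually come from user input and may be missing. The sketch below shows one possible way to assemble the same request body in Python; build_search_body is a hypothetical helper, not part of any Elasticsearch client, and it only builds the dictionary without sending it anywhere.

```python
import json


def build_search_body(keyword, min_price=None, max_price=None):
    """Build a bool query: multi_match in must, optional price range in filter."""
    bool_query = {
        "must": {
            "multi_match": {
                "query": keyword,
                "fields": ["name^3", "description^1", "category^2"],
            }
        }
    }
    # Only include the bounds the caller actually supplied.
    price_range = {}
    if min_price is not None:
        price_range["gte"] = min_price
    if max_price is not None:
        price_range["lte"] = max_price
    if price_range:
        bool_query["filter"] = {"range": {"price": price_range}}
    return {"query": {"bool": bool_query}}


body = build_search_body("lg", min_price=10000, max_price=50000)
print(json.dumps(body, indent=2, ensure_ascii=False))
```

Calling build_search_body("lg") with no bounds leaves out the filter clause, so the result is a bool query with only the must clause.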

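Filtering by category

To filter by category, the field must be of the keyword type, because filtering compares against the exact, whole value. A text field is analyzed into tokens, so a term query for the full category name may not match it.

However, the category field is also used for flexible full-text search of related products, so it must be usable as a text type as well.

For this reason, we apply both the text and keyword types to the category field at the same time, using a multi-field.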
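The difference between the two types can be pictured with a toy Python model. The token list below is an assumption made for illustration (that the analyzer splits the compound noun "가전제품" into two tokens); the actual tokens should be checked with the Analyze API. The function names are hypothetical.

```python
def term_matches_keyword(stored_value, term):
    """keyword field: the whole stored value is a single term."""
    return stored_value == term


def term_matches_text(tokens, term):
    """text field: the term is compared against each analyzed token."""
    return term in tokens


stored = "가전제품"
# Hypothetical tokens: assume the analyzer split the compound noun in two.
assumed_tokens = ["가전", "제품"]

print(term_matches_keyword(stored, "가전제품"))       # True
print(term_matches_text(assumed_tokens, "가전제품"))  # False
print(term_matches_text(assumed_tokens, "가전"))      # True
```

Under that assumption, an exact-value filter on the analyzed text field would miss the document, which is why the exact comparison is done against a keyword sub-field instead.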


DELETE /products

PUT /products
{
  "settings": {
    "analysis": {
      // Token filter definition
      "filter": {
        "product_synonyms": {
          "type": "synonym",
          "synonyms": [
            "samsung, 삼성",
            "apple, 애플",
            "노트북, 랩탑, 컴퓨터, computer, laptop, notebook",
            "전화기, 휴대폰, 핸드폰, 스마트폰, 휴대전화, phone, smartphone, mobile phone, cell phone",
            "아이폰, iphone",
            "맥북, 맥, macbook, mac"
          ]
        }
      },

      // Custom analyzer definitions
      "analyzer": {
        "products_name_analyzer": {
          "char_filter": [],
          "tokenizer": "nori_tokenizer",
          "filter": [
            "nori_part_of_speech", 
            "nori_readingform", 
            "lowercase",
            "product_synonyms"
          ]
        },
        "products_description_analyzer": {
          "char_filter": ["html_strip"],
          "tokenizer": "nori_tokenizer",
          "filter": [
            "nori_part_of_speech", 
            "nori_readingform", 
            "lowercase"
          ]
        },
        "products_category_analyzer": {
          "char_filter": [],
          "tokenizer": "nori_tokenizer",
          "filter": [
            "nori_part_of_speech", 
            "nori_readingform", 
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "name": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_name_analyzer"
      },
      "description": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_description_analyzer"
      },
      "price": {
        "type": "integer" // integer of at most 1 billion
      },
      "rating": {
        "type": "double" // real number (may have a fractional part)
      },
      "category": {
        "type": "text", // needs flexible full-text search
        "analyzer": "products_category_analyzer",
        // Add a keyword sub-field as a multi-field
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Insert some dummy data first so we can test.


POST /products/_doc
{
  "id": 1,
  "name": "LG OLED TV 42인치",
  "description": "<p>선명한 화질의 LG OLED TV입니다.</p>",
  "price": 3000,
  "rating": 4.8,
  "category": "가전제품"
}

POST /products/_doc
{
  "id": 2,
  "name": "LG 냉장고 300L",
  "description": "<p>신선함을 오래 유지하는 LG 냉장고</p>",
  "price": 49000,
  "rating": 4.5,
  "category": "가전제품"
}

POST /products/_doc
{
  "id": 3,
  "name": "삼성 냉장고",
  "description": "<p>삼성의 최신 기술이 적용된 냉장고</p>",
  "price": 20000,
  "rating": 4.7,
  "category": "가전제품"
}

POST /products/_doc
{
  "id": 4,
  "name": "LG 스마트폰",
  "description": "<p>LG의 최신 스마트폰 모델</p>",
  "price": 25000,
  "rating": 4.2,
  "category": "휴대전화"
}

Run the query below to check that filtering works.


POST /products/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "LG",
          "fields": [
            "name^3",
            "description^1",
            "category^2"
          ]
        }
      },
      "filter": [
        {
          "term": {
            "category.raw": "가전제품"
          }
        },
        {
          "range": {
            "price": {
              "gte": 10000,
              "lte": 50000
            }
          }
        }
      ]
    }
  }
}
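Before looking at the response, it helps to work out which documents should come back. The Python sketch below applies the same category and price conditions to the four dummy documents. The mentions function is only a rough stand-in for multi_match (a case-insensitive substring check, not real analysis or scoring), and both helper names are made up for this example.

```python
DOCS = [
    {"id": 1, "name": "LG OLED TV 42인치",
     "description": "<p>선명한 화질의 LG OLED TV입니다.</p>",
     "price": 3000, "category": "가전제품"},
    {"id": 2, "name": "LG 냉장고 300L",
     "description": "<p>신선함을 오래 유지하는 LG 냉장고</p>",
     "price": 49000, "category": "가전제품"},
    {"id": 3, "name": "삼성 냉장고",
     "description": "<p>삼성의 최신 기술이 적용된 냉장고</p>",
     "price": 20000, "category": "가전제품"},
    {"id": 4, "name": "LG 스마트폰",
     "description": "<p>LG의 최신 스마트폰 모델</p>",
     "price": 25000, "category": "휴대전화"},
]


def passes_filters(doc, category, min_price, max_price):
    """Exact category match plus an inclusive price range (gte / lte)."""
    return doc["category"] == category and min_price <= doc["price"] <= max_price


def mentions(doc, keyword):
    """Rough stand-in for multi_match: case-insensitive substring check."""
    needle = keyword.lower()
    return any(needle in doc[field].lower()
               for field in ("name", "description", "category"))


filtered = [d["id"] for d in DOCS if passes_filters(d, "가전제품", 10000, 50000)]
hits = [d["id"] for d in filtered_docs] if (filtered_docs := [
    d for d in DOCS
    if passes_filters(d, "가전제품", 10000, 50000) and mentions(d, "LG")
]) else []

print(filtered)  # [2, 3]
print(hits)      # [2]
```

Document 1 is excluded by its price, document 4 by its category, and document 3 passes both filters but does not mention LG, so only document 2 is expected in the search result.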

category

Elasticsearch

createdAt

Dec 6, 2025 03:54 AM

series

실전에서 바로 써먹는 Elasticsearch 입문 (검색 최적화편)

This post is part of the course materials for the lecture series "실전에서 바로 써먹는 Elasticsearch 입문 (검색 최적화편)".
