Why is it necessary to check if the first two bytes of data are \r\n? #3300

kwsy · 2024-09-30T06:47:40Z

`def parse(self, unreader):
    buf = io.BytesIO()
    self.get_data(unreader, buf, stop=True)

    # get request line
    line, rbuf = self.read_line(unreader, buf, self.limit_request_line)

    # proxy protocol
    if self.proxy_protocol(bytes_to_str(line)):
        # get next request line
        buf = io.BytesIO()
        buf.write(rbuf)
        line, rbuf = self.read_line(unreader, buf, self.limit_request_line)

    self.parse_request_line(line)           # parse the request line, i.e. the first line
    buf = io.BytesIO()
    buf.write(rbuf)

    # Headers: next, parse the headers
    data = buf.getvalue()
    idx = data.find(b"\r\n\r\n")

    done = data[:2] == b"\r\n"
    while True:
        idx = data.find(b"\r\n\r\n")
        done = data[:2] == b"\r\n"    # what is the basis for this statement?

        if idx < 0 and not done:
            self.get_data(unreader, buf)
            data = buf.getvalue()
            if len(data) > self.max_buffer_headers:
                raise LimitRequestHeaders("max buffer headers")
        else:
            break

    if done:
        self.unreader.unread(data[2:])
        return b""

    self.headers = self.parse_headers(data[:idx])

    ret = data[idx + 4:]    # the body part
    buf = None
    return ret
`

`parse` is a method of the `Request` class. It reads the header section of an HTTP request. The header section is always separated from the message body by \r\n\r\n. However, the method also checks whether the first two bytes of data are \r\n, which seems strange. Isn't \r\n\r\n the only end marker of the header section? Why is the header section considered finished when data starts with \r\n, even if \r\n\r\n has not been found? Is there any basis for doing this?


pajod · 2024-10-03T00:30:21Z

> Why, even if \r\n\r\n is not found, is the message header considered to have ended?

Because when the request line (and the PROXY line, if present) was read, its trailing \r\n was discarded from `buf`. Thus `done` distinguishes two cases: a) expecting two consecutive \r\n that mark the end of the last header, and b) expecting just a single \r\n at the start of the remaining data, which signals an HTTP/1.0 request with zero headers.
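The two cases can be illustrated with a minimal standalone sketch. It is not the gunicorn code: `read_line` and `split_after_request_line` are hypothetical helpers written only for this illustration, and they assume the whole request is already in memory.

```python
def read_line(raw: bytes):
    """Hypothetical helper: split off the first line and drop its trailing CRLF."""
    idx = raw.find(b"\r\n")
    return raw[:idx], raw[idx + 2:]


def split_after_request_line(data: bytes):
    """Hypothetical helper: `data` is what remains once the request line
    and its CRLF have been consumed.

    Returns (header_block, rest) when the header section is complete,
    or None when more bytes are needed.
    """
    if data[:2] == b"\r\n":
        # Case b): zero headers. The first CRLF of the usual CRLF CRLF
        # terminator was already consumed together with the request line.
        return b"", data[2:]
    idx = data.find(b"\r\n\r\n")
    if idx < 0:
        # Case a), still incomplete: keep reading.
        return None
    return data[:idx], data[idx + 4:]


# Request with no headers: only a single CRLF is left after the request line,
# so searching for CRLF CRLF alone would never succeed.
line, rest = read_line(b"GET / HTTP/1.0\r\n\r\n")
no_headers = split_after_request_line(rest)

# Request with one header and a body.
line2, rest2 = read_line(b"GET / HTTP/1.1\r\nHost: a\r\n\r\nbody")
with_headers = split_after_request_line(rest2)
```

In the first request, `rest` is just `b"\r\n"`, which contains no `b"\r\n\r\n"`. Without the check on the first two bytes, the loop would keep waiting for data that never arrives.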

The code could be refactored to remove most of the complexity and be more efficient for invalid/oversize/partial input - but I do not see a logical flaw. If you do, please show the request that you consider incorrectly parsed.
