Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow loading Dashboard #33582

Open
rico132 opened this issue Feb 13, 2025 · 14 comments
Open

Slow loading Dashboard #33582

rico132 opened this issue Feb 13, 2025 · 14 comments
Labels
performance/speed performance issues with slow downs type/bug

Comments

@rico132
Copy link

rico132 commented Feb 13, 2025

Description

Hello there,

Loading the dashboard takes around 3 to 5 seconds. I found a query in the logs that takes over 1s to finish.
The query is from action_list.go:224:GetFeeds() :

SELECT count(*) FROM `action` WHERE act_user_id IN
        (SELECT `user`.id FROM `user` WHERE
            (keep_activity_private=? AND visibility IN (?,?)
            ) OR id=? OR
            (type=? AND `user`.id IN
                (SELECT org_id FROM team_user WHERE uid=?
                )
            )
        ) AND repo_id IN
        (SELECT id FROM repository WHERE
            (`repository`.is_private=? AND `repository`.owner_id NOT IN
                (SELECT id FROM `user` WHERE type=? AND visibility IN (?)
                )
            ) OR `repository`.id IN
                (SELECT repo_id FROM `access` WHERE `access`.user_id=? AND `access`.mode>?
                ) OR `repository`.id IN
                    (SELECT `team_repo`.repo_id FROM team_repo INNER JOIN team_user ON `team_user`.team_id = `team_repo`.team_id WHERE `team_user`.uid=?
                    ) OR `repository`.owner_id=? OR (`repository`.is_private=? AND `repository`.owner_id IN
                        (SELECT `org_user`.org_id FROM org_user WHERE `org_user`.uid=?
                        )
                )
        ) AND user_id=? AND is_deleted=?

As I am not familiar with SQL indexes, is there anything one can do to speed this up?

Gitea Version

1.23.3

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

Image

Git Version

No response

Operating System

No response

How are you running Gitea?

I am using Nomad's Docker driver to deploy gitea/gitea:1.23.3-rootless

Database

MySQL/MariaDB

@kemzeb kemzeb added the performance/speed performance issues with slow downs label Feb 13, 2025
@wxiaoguang
Copy link
Contributor

That's a longstanding problem, you could truncate all data or delete old (1 year ago) data from action table.

@rico132
Copy link
Author

rico132 commented Feb 13, 2025

So, there is no other way of speeding it up?

And another question: why is it a user-specific problem?

@wxiaoguang
Copy link
Contributor

So, there is no other way of speeding it up?

TBH I have no idea.

And another question: why is it a user-specific problem?

The action table contains the "user activity" records (for example: push, open issue, write comment, merge pr, etc). The table might become quite large (millions of rows or even more) and then it slows down the pages when there are requests to load a lot of data from it.

@rico132
Copy link
Author

rico132 commented Feb 13, 2025

The admin account has 38651 actions, my account has 28269 actions, and a recently added member account has 3190 actions.

The admin account's dashboard loads in 50-200ms, my account's dashboard (or any other 'older' member's accounts) loads in 3–5 seconds and the recently added member account's dashboard loads in 50-200ms.

I would agree if the admin had to wait the longest and the recently added member had to wait the shortest amount of time, but it is not in correlation of the actions entries if the admin dashboard loads instantly even though it has more actions than the other accounts.

@wxiaoguang
Copy link
Contributor

You could try to "EXPLAIN" the SQL to see how the DB server executes, same "row count" doesn't mean the same execution plan or time, it is a quite complicated topic.

@rico132
Copy link
Author

rico132 commented Feb 13, 2025

Okay, thank you for taking your time to explain.

@techknowlogick
Copy link
Member

@rico132 do you have your nomad hcl for gitea published publicly? Not asking as a way to help debug this, but only for my own curiosity to see how folks implement it.

@lunny
Copy link
Member

lunny commented Feb 13, 2025

The admin account has 38651 actions, my account has 28269 actions, and a recently added member account has 3190 actions.

The admin account's dashboard loads in 50-200ms, my account's dashboard (or any other 'older' member's accounts) loads in 3–5 seconds and the recently added member account's dashboard loads in 50-200ms.

I would agree if the admin had to wait the longest and the recently added member had to wait the shortest amount of time, but it is not in correlation of the actions entries if the admin dashboard loads instantly even though it has more actions than the other accounts.

Can you get the SQL execute time from your log if you enabled [database]LOG_SQL=true? There are two possible reasons here, one is from the database queries, another is from getting commits information from git data.

@lunny
Copy link
Member

lunny commented Feb 13, 2025

SELECT `id`, `user_id`, `op_type`, `act_user_id`, `repo_id`, `comment_id`, `is_deleted`, `ref_name`, `is_private`, `content`, `created_unix` FROM `action` WHERE act_user_id IN (SELECT `user`.id FROM `user` WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (`repository`.is_private=? AND `repository`.owner_id NOT IN (SELECT id FROM `user` WHERE type=? AND visibility IN (?,?)))) AND `action`.repo_id=? AND (`action`.user_id = `action`.act_user_id) AND is_deleted=? ORDER BY `action`.`created_unix` DESC LIMIT 10


SELECT count(*) FROM `action` WHERE act_user_id IN (SELECT `user`.id FROM `user` WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (`repository`.is_private=? AND `repository`.owner_id NOT IN (SELECT id FROM `user` WHERE type=? AND visibility IN (?,?)))) AND `action`.repo_id=? AND (`action`.user_id = `action`.act_user_id) AND is_deleted=?

@rico132
Copy link
Author

rico132 commented Feb 14, 2025

The admin account has 38651 actions, my account has 28269 actions, and a recently added member account has 3190 actions.
The admin account's dashboard loads in 50-200ms, my account's dashboard (or any other 'older' member's accounts) loads in 3–5 seconds and the recently added member account's dashboard loads in 50-200ms.
I would agree if the admin had to wait the longest and the recently added member had to wait the shortest amount of time, but it is not in correlation of the actions entries if the admin dashboard loads instantly even though it has more actions than the other accounts.

Can you get the SQL execute time from your log if you enabled [database]LOG_SQL=true? There are two possible reasons here, one is from the database queries, another is from getting commits information from git data.

The execution time was 1.036847043s. Sometimes it is 1.6s or 1.9s

SELECT id, user_id, op_type, act_user_id, repo_id, comment_id, is_deleted, ref_name, is_private, content, created_unix FROM action WHERE act_user_id IN (SELECT user.id FROM user WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (repository.is_private=? AND repository.owner_id NOT IN (SELECT id FROM user WHERE type=? AND visibility IN (?,?)))) AND action.repo_id=? AND (action.user_id = action.act_user_id) AND is_deleted=? ORDER BY action.created_unix DESC LIMIT 10

SELECT count(*) FROM action WHERE act_user_id IN (SELECT user.id FROM user WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (repository.is_private=? AND repository.owner_id NOT IN (SELECT id FROM user WHERE type=? AND visibility IN (?,?)))) AND action.repo_id=? AND (action.user_id = action.act_user_id) AND is_deleted=?

What am I supposed to do here?

@rico132
Copy link
Author

rico132 commented Feb 14, 2025

@techknowlogick The hcl looks something like this (I removed some URLs):

job "gitea" {
  type = "service"
  datacenters = ["*"]
  constraint {
    attribute = node.class
    value     = "app"
  }

  vault {
    policies = ["gitea"]
  }

  group "gitea" {
    network {
      port "http" {
        to           = 
        host_network = "internal"
      }
      port "ssh" {
        to           = 
        host_network = "internal"
      }
    }

    volume "gitea-app-data" {
      type            = "csi"
      attachment_mode = "file-system"
      access_mode     = "multi-node-multi-writer"
      read_only       = false
      source          = "gitea-app-data"
    }

    service {
      name = "git"
      port = "http"
      tags = ["http"]
      check {
        type            = "http"
        protocol        = "https"
        tls_server_name = ""
        port            = "http"
        path            = "/api/healthz"
        interval        = "10s"
        timeout         = "3s"
      }
    }

    service {
      name = "git"
      port = "ssh"
      tags = ["ssh"]
      check {
        type     = "tcp"
        port     = "ssh"
        interval = "10s"
        timeout  = "3s"
      }
    }

    task "ca-prepare" {
      driver = "docker"

      user = "root"

      config {
        image = "gitea/gitea:1.23.3"
        args = ["bash", "${NOMAD_TASK_DIR}/prepare.sh"]
      }

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      template {
        data        = <<EOF
CA_NAME=
cp ${NOMAD_TASK_DIR}/${CA_NAME}.crt /usr/local/share/ca-certificates/${CA_NAME}.crt
update-ca-certificates
cp /etc/ssl/certs/ca-certificates.crt ${NOMAD_ALLOC_DIR}/ca-certificates.crt
EOF
        destination = "local/prepare.sh"
        change_mode = "restart"
      }

      artifact {
        source = "ca"
        options {
          checksum = ""
        }
      }
    }

    task "prepare-volume" {
      driver = "docker"
      volume_mount {
        volume      = "gitea-app-data"
        destination = "/var/lib/gitea"
        read_only   = false
      }
      config {
        image   = "busybox:latest"
        command = "sh"
        args = ["-c", "chown -R 1000:1000 /var/lib/gitea"]
      }

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
    }

    task "gitea" {
      driver = "docker"

      config {
        image = "gitea/gitea:1.23.3-rootless"
        init  = true
        volumes = [
          "./local/app.ini:/etc/gitea/app.ini",
          "./local/templates/sidebar.tmpl:/var/lib/gitea/custom/templates/repo/issue/view_content/sidebar.tmpl",
          "../alloc/ca-certificates.crt:/etc/ssl/certs/ca-certificates.crt"
        ]
        ports = [
          "http",
          "ssh"
        ]
      }

      volume_mount {
        volume      = "gitea-app-data"
        destination = "/var/lib/gitea"
        read_only   = false
      }

      template {
        data = file("./gitea/app.ini")
        destination = "${NOMAD_TASK_DIR}/app.ini"
        change_mode = "restart"
      }

      template {
        data = file("./gitea/templates/sidebar.tmpl")
        destination = "local/templates/sidebar.tmpl"
        left_delimiter  = "~$"
        right_delimiter  = "$~"
        change_mode = "restart"
      }

      template {
        data        = <<EOF
{{ with secret "" "common_name=" "ttl=24h" }}
{{ .Data.certificate }}
{{ .Data.issuing_ca }}
{{ .Data.private_key }}
{{ end }}
EOF
            destination = "${NOMAD_SECRETS_DIR}/tls.pem"
        change_mode = "restart"
      }

      template {
        data        = <<EOF
{{ with secret "" }}
{{ index .Data.data "private-key" }}
{{ end }}
EOF
        destination = "${NOMAD_SECRETS_DIR}/gitea.rsa"
        change_mode = "restart"
      }

      template {
        data        = <<EOF
{{ with secret "" }}
{{ index .Data.data "public-key" }}
{{ end }}
EOF
        destination = "${NOMAD_SECRETS_DIR}/gitea.rsa.pub"
        change_mode = "restart"
      }

      resources {
        memory = 25600
        cpu    = 16000
      }
    }
  }

  group "runner" {
    count = 4

    volume "gitea-runner-data" {
      type      = "host"
      read_only = false
      source    = "gitea-runner-data"
      per_alloc = true
    }

    volume "gitea-runner-docker" {
      type      = "host"
      read_only = false
      source    = "gitea-runner-docker"
      per_alloc = true
    }

    service {
      name = "gitea-runner"
    }

    task "ca-prepare" {
      driver = "docker"

      user = "root"

      config {
        image = "gitea/act_runner:0.2.11"
        args = ["bash", "${NOMAD_TASK_DIR}/prepare.sh"]
        entrypoint = [""]
      }

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      template {
        data        = <<EOF
CA_NAME=
cp ${NOMAD_TASK_DIR}/${CA_NAME}.crt /usr/local/share/ca-certificates/${CA_NAME}.crt
update-ca-certificates
cp /etc/ssl/certs/ca-certificates.crt ${NOMAD_ALLOC_DIR}/ca-certificates.crt
EOF
        destination = "local/prepare.sh"
        change_mode = "restart"
      }

      artifact {
        source = "ca"
        options {
          checksum = ""
        }
      }
    }

    task "gitea-runner" {
      driver = "docker"

      volume_mount {
        volume      = "gitea-runner-data"
        destination = "/data"
        read_only   = false
      }

      volume_mount {
        volume      = "gitea-runner-docker"
        destination = "/home/rootless/.local/share/docker"
        read_only   = false
      }

      template {
        data = file("./gitea/config.yaml")
        destination = "${NOMAD_TASK_DIR}/config.yaml"
        change_mode = "restart"
      }

      template {
        data        = <<EOF
{{ with secret "" }}
GITEA_RUNNER_REGISTRATION_TOKEN="{{ .Data.data.token }}"
{{ end }}
EOF
        destination = "secrets/vault.env"
        env         = true
      }

      config {
        image      = "gitea/act_runner:0.2.11-dind-rootless"
        privileged = true
        volumes = [
          "../alloc/ca-certificates.crt:/etc/ssl/certs/ca-certificates.crt"
        ]
      }

      env {
        CONFIG_FILE        = "/local/config.yaml"
        GITEA_INSTANCE_URL = ""
        GITEA_RUNNER_NAME  = "runner-${NOMAD_ALLOC_INDEX}"
        DOCKER_HOST        = "unix:///var/run/user/1000/docker.sock"
      }
      resources {
        memory = 2048
        cpu    = 2000
      }
    }
  }
}

@lunny
Copy link
Member

lunny commented Feb 14, 2025

The admin account has 38651 actions, my account has 28269 actions, and a recently added member account has 3190 actions.
The admin account's dashboard loads in 50-200ms, my account's dashboard (or any other 'older' member's accounts) loads in 3–5 seconds and the recently added member account's dashboard loads in 50-200ms.
I would agree if the admin had to wait the longest and the recently added member had to wait the shortest amount of time, but it is not in correlation of the actions entries if the admin dashboard loads instantly even though it has more actions than the other accounts.

Can you get the SQL execute time from your log if you enabled [database]LOG_SQL=true? There are two possible reasons here, one is from the database queries, another is from getting commits information from git data.

The execution time was 1.036847043s. Sometimes it is 1.6s or 1.9s

SELECT id, user_id, op_type, act_user_id, repo_id, comment_id, is_deleted, ref_name, is_private, content, created_unix FROM action WHERE act_user_id IN (SELECT user.id FROM user WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (repository.is_private=? AND repository.owner_id NOT IN (SELECT id FROM user WHERE type=? AND visibility IN (?,?)))) AND action.repo_id=? AND (action.user_id = action.act_user_id) AND is_deleted=? ORDER BY action.created_unix DESC LIMIT 10
SELECT count(*) FROM action WHERE act_user_id IN (SELECT user.id FROM user WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (repository.is_private=? AND repository.owner_id NOT IN (SELECT id FROM user WHERE type=? AND visibility IN (?,?)))) AND action.repo_id=? AND (action.user_id = action.act_user_id) AND is_deleted=?

What am I supposed to do here?

I just record the slow SQL here and will investigate it.

@uxbug
Copy link

uxbug commented Feb 15, 2025

my issue, also gitea 1.23.3
login to home dashbord or open home page takes about 2 minutes
#31698 (comment)

@rico132
Copy link
Author

rico132 commented Feb 17, 2025

The admin account has 38651 actions, my account has 28269 actions, and a recently added member account has 3190 actions.
The admin account's dashboard loads in 50-200ms, my account's dashboard (or any other 'older' member's accounts) loads in 3–5 seconds and the recently added member account's dashboard loads in 50-200ms.
I would agree if the admin had to wait the longest and the recently added member had to wait the shortest amount of time, but it is not in correlation of the actions entries if the admin dashboard loads instantly even though it has more actions than the other accounts.

Can you get the SQL execute time from your log if you enabled [database]LOG_SQL=true? There are two possible reasons here, one is from the database queries, another is from getting commits information from git data.

The execution time was 1.036847043s. Sometimes it is 1.6s or 1.9s

SELECT id, user_id, op_type, act_user_id, repo_id, comment_id, is_deleted, ref_name, is_private, content, created_unix FROM action WHERE act_user_id IN (SELECT user.id FROM user WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (repository.is_private=? AND repository.owner_id NOT IN (SELECT id FROM user WHERE type=? AND visibility IN (?,?)))) AND action.repo_id=? AND (action.user_id = action.act_user_id) AND is_deleted=? ORDER BY action.created_unix DESC LIMIT 10
SELECT count(*) FROM action WHERE act_user_id IN (SELECT user.id FROM user WHERE keep_activity_private=? AND visibility=?) AND repo_id IN (SELECT id FROM repository WHERE (repository.is_private=? AND repository.owner_id NOT IN (SELECT id FROM user WHERE type=? AND visibility IN (?,?)))) AND action.repo_id=? AND (action.user_id = action.act_user_id) AND is_deleted=?

What am I supposed to do here?

I just record the slow SQL here and will investigate it.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance/speed performance issues with slow downs type/bug
Projects
None yet
Development

No branches or pull requests

6 participants