[Feature]: Adding a new protocol to SPU #922

Open
c-doubley opened this issue Nov 25, 2024 · 13 comments

@c-doubley

Feature Request Type

Build/Install

Have you searched existing issues?

Yes

Describe features you want to add to SPU

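When adding a new protocol to SPU, there is an API test that covers arithmetic operators, boolean operators, and the conversions between arithmetic and boolean shares. I now want to add a protocol that contains only arithmetic operators. If I do that, will calls from the upper layers be prone to errors? I would like to know which operations could fail.
For example, when training a simple neural network at the upper layer, under what circumstances would the underlying boolean operators be invoked? Can this be avoided? For instance, can I control things at the upper layer so that model training uses only arithmetic operators, or will some computations inevitably involve boolean operators?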

@deadlywing
Contributor

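In short, only addition and subtraction would be supported. Multiplication also depends on whether you have implemented trunc. Other math functions, such as exp and log, will not run at all.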
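To illustrate why multiplication depends on trunc, here is a minimal plaintext sketch. It assumes a fixed-point encoding on a 64-bit ring with 18 fractional bits; both numbers, and the helper names `encode`, `decode`, `mul_no_trunc`, and `trunc`, are illustrative choices and not SPU APIs. The product of two encoded values carries twice the scale, so it only decodes correctly after a truncation step.

```python
RING_BITS = 64   # assumed ring size (illustrative)
FRAC_BITS = 18   # assumed number of fractional bits (illustrative)
MASK = (1 << RING_BITS) - 1


def encode(x: float) -> int:
    """Map a real number to a ring element with FRAC_BITS fractional bits."""
    return int(round(x * (1 << FRAC_BITS))) & MASK


def to_signed(v: int) -> int:
    """Interpret a ring element as a two's-complement signed integer."""
    return v - (1 << RING_BITS) if v >> (RING_BITS - 1) else v


def decode(v: int) -> float:
    """Inverse of encode for values that carry the normal scale."""
    return to_signed(v) / (1 << FRAC_BITS)


def mul_no_trunc(a: int, b: int) -> int:
    """Ring multiplication only: the result has scale 2**(2 * FRAC_BITS)."""
    return (a * b) & MASK


def trunc(v: int, bits: int = FRAC_BITS) -> int:
    """Arithmetic right shift that restores the normal scale."""
    return (to_signed(v) >> bits) & MASK


a, b = encode(1.5), encode(-2.25)
raw = mul_no_trunc(a, b)
fixed = trunc(raw)
print(decode(raw))    # off by a factor of 2**FRAC_BITS
print(decode(fixed))  # -3.375
```

In a secret-shared setting the truncation itself is a protocol step, which is why a protocol without it can only multiply integers.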

@c-doubley
Author

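Following spdz2k, I implemented all of the arithmetic operators below:
ctx->prot()
    ->regKernel<spdz2k::P2A, spdz2k::A2P, spdz2k::A2V, spdz2k::V2A,
                spdz2k::NotA, spdz2k::AddAP, spdz2k::AddAA, spdz2k::MulAP,
                spdz2k::MulAA, spdz2k::MatMulAP, spdz2k::MatMulAA,
                spdz2k::LShiftA, spdz2k::TruncA, spdz2k::RandA>();
Addition, subtraction, and multiplication should all be fine. Can functions like exp and log not be handled with polynomial approximation either?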

@deadlywing
Contributor

Almost all math functions depend heavily on a2b, because they need range reduction.
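A plaintext sketch of the range-reduction point, using exp as the example. This is a generic illustration, not SPU's actual implementation; the function names and the Taylor degree are assumptions. A polynomial built from additions and multiplications is only accurate near the expansion point. Reducing the input to a small interval first requires the integer part of the input, which on secret shares is a bit-level operation and hence typically goes through a2b.

```python
import math


def exp_poly(r: float, degree: int = 8) -> float:
    """Truncated Taylor series of exp around 0: additions and multiplications only."""
    term, total = 1.0, 1.0
    for k in range(1, degree + 1):
        term *= r / k
        total += term
    return total


def exp_range_reduced(x: float) -> float:
    """exp(x) = 2**k * exp(r), where x = k*ln(2) + r and |r| <= ln(2)/2."""
    # Computing k needs the integer part of x. On a secret value this is
    # the step that calls for comparison or bit decomposition.
    k = math.floor(x / math.log(2) + 0.5)
    r = x - k * math.log(2)
    return math.ldexp(exp_poly(r), k)


x = 10.0
exact = math.exp(x)
rel_err_plain = abs(exp_poly(x) - exact) / exact
rel_err_reduced = abs(exp_range_reduced(x) - exact) / exact
print(rel_err_plain, rel_err_reduced)
```

With the same polynomial degree, the direct evaluation at x = 10 is far off, while the range-reduced version stays close to the exact value.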

@c-doubley
Author

OK, thank you for the answer.

@zhangwaer

Almost all math functions depend heavily on a2b, because they need range reduction.

Hello! I am curious why the following a2b needs to generate r0 and r1. As I understand it, the a2b here can be done locally, so that the parties can then run the subsequent msb operations, and r0 and r1 are not transferred to the other parties. So why do we need to generate two random numbers? Could you please give me some advice? Thanks!

NdArrayRef A2B::proc(KernelEvalContext* ctx, const NdArrayRef& x) const {
  const auto field = x.eltype().as<Ring2k>()->field();
  auto* comm = ctx->getState<Communicator>();
  auto* prg_state = ctx->getState<PrgState>();

  std::vector<NdArrayRef> bshrs;
  const auto bty = makeType<BShrTy>(field);
  for (size_t idx = 0; idx < comm->getWorldSize(); idx++) {
    auto [r0, r1] =
        prg_state->genPrssPair(field, x.shape(), PrgState::GenPrssCtrl::Both);
    auto b = ring_xor(r0, r1).as(bty);

    if (idx == comm->getRank()) {
      ring_xor_(b, x);
    }
    bshrs.push_back(b.as(bty));
  }

  NdArrayRef res = vreduce(bshrs.begin(), bshrs.end(),
                           [&](const NdArrayRef& xx, const NdArrayRef& yy) {
                             return wrap_add_bb(ctx->sctx(), xx, yy);
                           });
  return res.as(bty);
}

@deadlywing
Contributor

#916 may help.
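One possible reading of the a2b snippet, offered as an assumption rather than as the maintainers' answer: if `genPrssPair` gives each party a pair (r0, r1) in which one element is also held by a neighbouring party, then the XOR of `r0 ^ r1` over all parties is zero. Each iteration of the loop would then produce a randomized boolean sharing of one party's arithmetic share without any communication. The sketch below simulates that layout in plain Python; `prss_pairs` and `boolean_share_of` are hypothetical helpers, and the final addition is done in the clear only to check the result, whereas the kernel uses `wrap_add_bb`.

```python
import secrets
from functools import reduce

N = 3      # number of parties (illustrative)
BITS = 64  # ring size (illustrative)
MASK = (1 << BITS) - 1


def prss_pairs(n):
    """Assumed layout: party i holds (k[i], k[(i + 1) % n])."""
    k = [secrets.randbits(BITS) for _ in range(n)]
    return [(k[i], k[(i + 1) % n]) for i in range(n)]


def xor_all(values):
    return reduce(lambda a, b: a ^ b, values, 0)


def boolean_share_of(owner, value, n=N):
    """XOR-share `value`, known only to `owner`, with no messages sent."""
    shares = []
    for rank, (r0, r1) in enumerate(prss_pairs(n)):
        b = r0 ^ r1          # these terms XOR to zero across all ranks
        if rank == owner:
            b ^= value       # only the owner mixes in its own share
        shares.append(b)
    return shares


# Arithmetic shares of a secret x.
x = 123456789
a_shares = [secrets.randbits(BITS) for _ in range(N - 1)]
a_shares.append((x - sum(a_shares)) & MASK)

# One boolean sharing per arithmetic share, mirroring the loop over idx.
b_sharings = [boolean_share_of(i, a_shares[i]) for i in range(N)]
recovered_shares = [xor_all(s) for s in b_sharings]
recovered = sum(recovered_shares) & MASK
print(recovered == x)
```

Under this reading the two values are not sent anywhere; they exist so that every party's boolean share is masked while the masks cancel globally.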

@c-doubley c-doubley reopened this Jan 1, 2025
@c-doubley
Author

c-doubley commented Jan 1, 2025

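Hello. The arithmetic unit tests for my protocol now pass. I then wrote the config file and added a testProto.py file under the example/python directory to test my new protocol. Addition, scalar multiplication, negation, and so on all work, but secret multiplication fails with a timeout error.
The error output is below. What I do not understand is why MulAA passes in the unit tests yet fails when used in an application. I need to determine whether this is an internal error in the algorithm or whether some other operators are missing, such as boolean computation, because I noticed spu::mpc::BinaryKernel::evaluate()+0xfffd17f59754 in the error output.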

[2025-01-01 10:23:09,144] [ForkServerProcess-3] Traceback (most recent call last):
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/utils/nodectl.runfiles/spulib/spu/utils/distributed_impl.py", line 326, in Run
    ret_objs = fn(self, *args, **kwargs)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/utils/nodectl.runfiles/spulib/spu/utils/distributed_impl.py", line 589, in builtin_spu_run
    rt.run(spu_exec)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/utils/nodectl.runfiles/spulib/spu/api.py", line 44, in run
    return self._vm.Run(executable.SerializeToString())
RuntimeError: what: 
        [external/yacl/yacl/link/transport/channel.cc:427] Get data timeout, key=root:P2P-6:0->2
stacktrace: 
#0 yacl::link::transport::Channel::Recv()+0xfffd18090b10
#1 yacl::link::Context::RecvInternal()+0xfffd18085870
#2 yacl::link::Context::Recv()+0xfffd18086b8c
#3 spu::mpc::Communicator::recv<>()+0xfffd175dcf08
#4 spu::mpc::sota4::MulAA::proc()::{lambda()#1}::operator()()::{lambda()#2}::operator()()+0xfffd175fc520
#5 spu::mpc::sota4::MulAA::proc()+0xfffd17606178
#6 spu::mpc::BinaryKernel::evaluate()+0xfffd17f59754
#7 spu::dynDispatch<>()+0xfffd17fd8b54
#8 spu::mpc::tiled<>()+0xfffd17ff7708
#9 spu::mpc::tiledDynDispatch<>()+0xfffd17ff7f8c
#10 spu::mpc::mul_aa()+0xfffd17ff80fc
#11 spu::mpc::mul_ss()+0xfffd17fda260
#12 spu::kernel::hal::_mul_ss()+0xfffd17fc9604
#13 spu::kernel::hal::_mul()+0xfffd17fb8f6c
#14 spu::kernel::hal::i_mul()+0xfffd17f0ff8c
#15 spu::kernel::hal::(anonymous namespace)::dtypeBinaryDispatch<>()+0xfffd17eee324



[2025-01-01 10:23:09,145] [ForkServerProcess-4] Traceback (most recent call last):
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/utils/nodectl.runfiles/spulib/spu/utils/distributed_impl.py", line 326, in Run
    ret_objs = fn(self, *args, **kwargs)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/utils/nodectl.runfiles/spulib/spu/utils/distributed_impl.py", line 589, in builtin_spu_run
    rt.run(spu_exec)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/utils/nodectl.runfiles/spulib/spu/api.py", line 44, in run
    return self._vm.Run(executable.SerializeToString())
RuntimeError: what: 
        [external/yacl/yacl/link/transport/channel.cc:427] Get data timeout, key=root:P2P-8:0->3
stacktrace: 
#0 yacl::link::transport::Channel::Recv()+0xfffd18090b10
#1 yacl::link::Context::RecvInternal()+0xfffd18085870
#2 yacl::link::Context::Recv()+0xfffd18086b8c
#3 spu::mpc::Communicator::recv<>()+0xfffd175dcf08
#4 spu::mpc::sota4::MulAA::proc()::{lambda()#1}::operator()()::{lambda()#2}::operator()()+0xfffd175f7988
#5 spu::mpc::sota4::MulAA::proc()+0xfffd17606178
#6 spu::mpc::BinaryKernel::evaluate()+0xfffd17f59754
#7 spu::dynDispatch<>()+0xfffd17fd8b54
#8 spu::mpc::tiled<>()+0xfffd17ff7708
#9 spu::mpc::tiledDynDispatch<>()+0xfffd17ff7f8c
#10 spu::mpc::mul_aa()+0xfffd17ff80fc
#11 spu::mpc::mul_ss()+0xfffd17fda260
#12 spu::kernel::hal::_mul_ss()+0xfffd17fc9604
#13 spu::kernel::hal::_mul()+0xfffd17fb8f6c
#14 spu::kernel::hal::i_mul()+0xfffd17f0ff8c
#15 spu::kernel::hal::(anonymous namespace)::dtypeBinaryDispatch<>()+0xfffd17eee324

Above is the error output from the running SPU backend; below is the error output from testProto.

Traceback (most recent call last):
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/examples/python/testProto.py", line 73, in <module>
    z = sum_squ(x, y)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/spu/utils/distributed_impl.py", line 693, in __call__
    results = [future.result() for future in futures]
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/spu/utils/distributed_impl.py", line 693, in <listcomp>
    results = [future.result() for future in futures]
  File "/home/cyy/miniforge3/envs/spu/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/cyy/miniforge3/envs/spu/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/cyy/miniforge3/envs/spu/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/spu/utils/distributed_impl.py", line 247, in run
    return self._call(self._stub.Run, fn, *args, **kwargs)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/spu/utils/distributed_impl.py", line 236, in _call
    rsp_data = rebuild_messages(rsp_itr.data for rsp_itr in rsp_gen)
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/spu/utils/distributed_impl.py", line 214, in rebuild_messages
    return b''.join([msg for msg in msgs])
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/spu/utils/distributed_impl.py", line 214, in <listcomp>
    return b''.join([msg for msg in msgs])
  File "/home/cyy/.cache/bazel/_bazel_cyy/00d760fb7a6e7f0044fd9bf53941d102/execroot/spulib/bazel-out/aarch64-opt/bin/examples/python/testProto.runfiles/spulib/spu/utils/distributed_impl.py", line 236, in <genexpr>
    rsp_data = rebuild_messages(rsp_itr.data for rsp_itr in rsp_gen)
  File "/home/cyy/miniforge3/envs/spu/lib/python3.10/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
  File "/home/cyy/miniforge3/envs/spu/lib/python3.10/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "Socket closed"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Socket closed", grpc_status:14, created_time:"2025-01-01T10:21:31.312391086+08:00"}"
>

@deadlywing
Contributor

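Here are a few possibilities you can check yourself:

  1. Is the test data so large that it times out? (The default seems to be around 20 seconds.)
  2. Were the test nodes not brought up?
  3. Is the implementation actually wrong? (Did you write the unit test yourself, or is it the one in the api test?)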

@c-doubley
Author

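  1. The test data is very simple: x=1, y=2, computing x*y, so oversized data should not be the cause.
  2. I do not quite understand what is meant by whether the test nodes were brought up. Below is the startup output of my backend.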
INFO: Running command line: bazel-bin/examples/python/utils/nodectl up
[2025-01-01 16:27:23,895] [ForkServerProcess-1] Starting grpc server at 127.0.0.1:61920
[2025-01-01 16:27:23,925] [ForkServerProcess-4] Starting grpc server at 127.0.0.1:61923
[2025-01-01 16:27:23,929] [ForkServerProcess-6] Starting grpc server at 127.0.0.1:61925
[2025-01-01 16:27:23,930] [ForkServerProcess-5] Starting grpc server at 127.0.0.1:61924
[2025-01-01 16:27:23,930] [ForkServerProcess-2] Starting grpc server at 127.0.0.1:61921
[2025-01-01 16:27:23,934] [ForkServerProcess-3] Starting grpc server at 127.0.0.1:61922

Below is the config 4pc.json

{
    "id": "outsourcing.4pc",
    "nodes": {
        "node:0": "127.0.0.1:61920",
        "node:1": "127.0.0.1:61921",
        "node:2": "127.0.0.1:61922",
        "node:3": "127.0.0.1:61923",
        "node:4": "127.0.0.1:61924",
        "node:5": "127.0.0.1:61925"
    },
    "devices": {
        "SPU": {
            "kind": "SPU",
            "config": {
                "node_ids": [
                    "node:0",
                    "node:1",
                    "node:2",
                    "node:3"
                ],
                "spu_internal_addrs": [
                    "127.0.0.1:61930",
                    "127.0.0.1:61931",
                    "127.0.0.1:61932",
                    "127.0.0.1:61933"
                ],
                "experimental_data_folder": [
                    "/tmp/spu_data_0/",
                    "/tmp/spu_data_1/",
                    "/tmp/spu_data_2/",
                    "/tmp/spu_data_3/"
                ],
                "runtime_config": {
                    "protocol": "SOTA4",
                    "field": "FM64",
                    "enable_pphlo_profile": true,
                    "enable_hal_profile": true
                },
                "link_desc": {
                    "throttle_window_size": 0,
                    "recv_timeout_ms": 3600000000,
                    "http_timeout_ms": 3600000000
                }
            }
        },
        "P1": {
            "kind": "PYU",
            "config": {
                "node_id": "node:4"
            }
        },
        "P2": {
            "kind": "PYU",
            "config": {
                "node_id": "node:5"
            }
        }
    }
}
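  3. I have debugged the algorithm implementation step by step. I cannot guarantee that it is free of problems, but it has passed ArithmeticTest many times. However, I have so far implemented only the arithmetic operators; the conversion and boolean operators are not implemented yet, and I do not know whether that is the cause.
    Below is the ApiTest output that passed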
[       OK ] Sota4/ApiTest.not_s/FM32x4 (4 ms)
[ RUN      ] Sota4/ApiTest.not_s/FM64x4
[       OK ] Sota4/ApiTest.not_s/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.not_s/FM128x4
[       OK ] Sota4/ApiTest.not_s/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.not_v/FM32x4
[       OK ] Sota4/ApiTest.not_v/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.not_v/FM64x4
[       OK ] Sota4/ApiTest.not_v/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.not_v/FM128x4
[       OK ] Sota4/ApiTest.not_v/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.not_p/FM32x4
[       OK ] Sota4/ApiTest.not_p/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.not_p/FM64x4
[       OK ] Sota4/ApiTest.not_p/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.not_p/FM128x4
[       OK ] Sota4/ApiTest.not_p/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.msb_v/FM32x4
[       OK ] Sota4/ApiTest.msb_v/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.msb_v/FM64x4
[       OK ] Sota4/ApiTest.msb_v/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.msb_v/FM128x4
[       OK ] Sota4/ApiTest.msb_v/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.msb_p/FM32x4
[       OK ] Sota4/ApiTest.msb_p/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.msb_p/FM64x4
[       OK ] Sota4/ApiTest.msb_p/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.msb_p/FM128x4
[       OK ] Sota4/ApiTest.msb_p/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.lshiftS/FM32x4
[       OK ] Sota4/ApiTest.lshiftS/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.lshiftS/FM64x4
[       OK ] Sota4/ApiTest.lshiftS/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.lshiftS/FM128x4
[       OK ] Sota4/ApiTest.lshiftS/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.lshiftV/FM32x4
[       OK ] Sota4/ApiTest.lshiftV/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.lshiftV/FM64x4
[       OK ] Sota4/ApiTest.lshiftV/FM64x4 (1 ms)
[ RUN      ] Sota4/ApiTest.lshiftV/FM128x4
[       OK ] Sota4/ApiTest.lshiftV/FM128x4 (1 ms)
[ RUN      ] Sota4/ApiTest.lshiftP/FM32x4
[       OK ] Sota4/ApiTest.lshiftP/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.lshiftP/FM64x4
[       OK ] Sota4/ApiTest.lshiftP/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.lshiftP/FM128x4
[       OK ] Sota4/ApiTest.lshiftP/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.rshiftV/FM32x4
[       OK ] Sota4/ApiTest.rshiftV/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.rshiftV/FM64x4
[       OK ] Sota4/ApiTest.rshiftV/FM64x4 (1 ms)
[ RUN      ] Sota4/ApiTest.rshiftV/FM128x4
[       OK ] Sota4/ApiTest.rshiftV/FM128x4 (1 ms)
[ RUN      ] Sota4/ApiTest.rshiftP/FM32x4
[       OK ] Sota4/ApiTest.rshiftP/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.rshiftP/FM64x4
[       OK ] Sota4/ApiTest.rshiftP/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.rshiftP/FM128x4
[       OK ] Sota4/ApiTest.rshiftP/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.arshiftV/FM32x4
[       OK ] Sota4/ApiTest.arshiftV/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.arshiftV/FM64x4
[       OK ] Sota4/ApiTest.arshiftV/FM64x4 (1 ms)
[ RUN      ] Sota4/ApiTest.arshiftV/FM128x4
[       OK ] Sota4/ApiTest.arshiftV/FM128x4 (1 ms)
[ RUN      ] Sota4/ApiTest.arshiftP/FM32x4
[       OK ] Sota4/ApiTest.arshiftP/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.arshiftP/FM64x4
[       OK ] Sota4/ApiTest.arshiftP/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.arshiftP/FM128x4
[       OK ] Sota4/ApiTest.arshiftP/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.TruncS/FM32x4
[       OK ] Sota4/ApiTest.TruncS/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.TruncS/FM64x4
[       OK ] Sota4/ApiTest.TruncS/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.TruncS/FM128x4
[       OK ] Sota4/ApiTest.TruncS/FM128x4 (0 ms)
[       OK ] Sota4/ApiTest.MatMulSS/FM128x4 (28 ms)
[ RUN      ] Sota4/ApiTest.MmulSP/FM32x4
[       OK ] Sota4/ApiTest.MmulSP/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.MmulSP/FM64x4
[       OK ] Sota4/ApiTest.MmulSP/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.MmulSP/FM128x4
[       OK ] Sota4/ApiTest.MmulSP/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.P2S_S2P/FM32x4
[       OK ] Sota4/ApiTest.P2S_S2P/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.P2S_S2P/FM64x4
[       OK ] Sota4/ApiTest.P2S_S2P/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.P2S_S2P/FM128x4
[       OK ] Sota4/ApiTest.P2S_S2P/FM128x4 (0 ms)
[ RUN      ] Sota4/ApiTest.P2V_V2P/FM32x4
[       OK ] Sota4/ApiTest.P2V_V2P/FM32x4 (0 ms)
[ RUN      ] Sota4/ApiTest.P2V_V2P/FM64x4
[       OK ] Sota4/ApiTest.P2V_V2P/FM64x4 (0 ms)
[ RUN      ] Sota4/ApiTest.P2V_V2P/FM128x4

It looks like many multiplication-related operators did not pass the API test. Below are the ArithmeticTest results.

[       OK ] Sota4/ArithmeticTest.addAA/FM32x4 (14 ms)
[ RUN      ] Sota4/ArithmeticTest.addAA/FM64x4
[       OK ] Sota4/ArithmeticTest.addAA/FM64x4 (6 ms)
[ RUN      ] Sota4/ArithmeticTest.addAA/FM128x4
[       OK ] Sota4/ArithmeticTest.addAA/FM128x4 (18 ms)
[ RUN      ] Sota4/ArithmeticTest.addAP/FM32x4
[       OK ] Sota4/ArithmeticTest.addAP/FM32x4 (3 ms)
[ RUN      ] Sota4/ArithmeticTest.addAP/FM64x4
[       OK ] Sota4/ArithmeticTest.addAP/FM64x4 (4 ms)
[ RUN      ] Sota4/ArithmeticTest.addAP/FM128x4
[       OK ] Sota4/ArithmeticTest.addAP/FM128x4 (9 ms)
[ RUN      ] Sota4/ArithmeticTest.mulAA/FM32x4
[       OK ] Sota4/ArithmeticTest.mulAA/FM32x4 (57 ms)
[ RUN      ] Sota4/ArithmeticTest.mulAA/FM64x4
[       OK ] Sota4/ArithmeticTest.mulAA/FM64x4 (35 ms)
[ RUN      ] Sota4/ArithmeticTest.mulAA/FM128x4
[       OK ] Sota4/ArithmeticTest.mulAA/FM128x4 (50 ms)
[ RUN      ] Sota4/ArithmeticTest.mulAP/FM32x4
[       OK ] Sota4/ArithmeticTest.mulAP/FM32x4 (6 ms)
[ RUN      ] Sota4/ArithmeticTest.mulAP/FM64x4
[       OK ] Sota4/ArithmeticTest.mulAP/FM64x4 (4 ms)
[ RUN      ] Sota4/ArithmeticTest.mulAP/FM128x4
[       OK ] Sota4/ArithmeticTest.mulAP/FM128x4 (15 ms)
[ RUN      ] Sota4/ArithmeticTest.SquareA/FM32x4
[       OK ] Sota4/ArithmeticTest.SquareA/FM32x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.SquareA/FM64x4
[       OK ] Sota4/ArithmeticTest.SquareA/FM64x4 (8 ms)
[ RUN      ] Sota4/ArithmeticTest.SquareA/FM128x4
[       OK ] Sota4/ArithmeticTest.SquareA/FM128x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulA1B/FM32x4
[       OK ] Sota4/ArithmeticTest.MulA1B/FM32x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulA1B/FM64x4
[       OK ] Sota4/ArithmeticTest.MulA1B/FM64x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulA1B/FM128x4
[       OK ] Sota4/ArithmeticTest.MulA1B/FM128x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulAV/FM32x4
[       OK ] Sota4/ArithmeticTest.MulAV/FM32x4 (4 ms)
[ RUN      ] Sota4/ArithmeticTest.MulAV/FM64x4
[       OK ] Sota4/ArithmeticTest.MulAV/FM64x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulAV/FM128x4
[       OK ] Sota4/ArithmeticTest.MulAV/FM128x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulA1BV/FM32x4
[       OK ] Sota4/ArithmeticTest.MulA1BV/FM32x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulA1BV/FM64x4
[       OK ] Sota4/ArithmeticTest.MulA1BV/FM64x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MulA1BV/FM128x4
[       OK ] Sota4/ArithmeticTest.MulA1BV/FM128x4 (30 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAP/FM32x4
[       OK ] Sota4/ArithmeticTest.MatMulAP/FM32x4 (2 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAP/FM64x4
[       OK ] Sota4/ArithmeticTest.MatMulAP/FM64x4 (8 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAP/FM128x4
[       OK ] Sota4/ArithmeticTest.MatMulAP/FM128x4 (2 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAA/FM32x4
[       OK ] Sota4/ArithmeticTest.MatMulAA/FM32x4 (26 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAA/FM64x4
[       OK ] Sota4/ArithmeticTest.MatMulAA/FM64x4 (48 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAA/FM128x4
[       OK ] Sota4/ArithmeticTest.MatMulAA/FM128x4 (14 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAV/FM32x4
[       OK ] Sota4/ArithmeticTest.MatMulAV/FM32x4 (2 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAV/FM64x4
[       OK ] Sota4/ArithmeticTest.MatMulAV/FM64x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.MatMulAV/FM128x4
[       OK ] Sota4/ArithmeticTest.MatMulAV/FM128x4 (1 ms)
[ RUN      ] Sota4/ArithmeticTest.NotA/FM32x4
[       OK ] Sota4/ArithmeticTest.NotA/FM32x4 (5 ms)
[ RUN      ] Sota4/ArithmeticTest.NotA/FM64x4
[       OK ] Sota4/ArithmeticTest.NotA/FM64x4 (23 ms)
[ RUN      ] Sota4/ArithmeticTest.NotA/FM128x4
[       OK ] Sota4/ArithmeticTest.NotA/FM128x4 (6 ms)
[ RUN      ] Sota4/ArithmeticTest.LShiftA/FM32x4
[       OK ] Sota4/ArithmeticTest.LShiftA/FM32x4 (4 ms)
[ RUN      ] Sota4/ArithmeticTest.LShiftA/FM64x4
[       OK ] Sota4/ArithmeticTest.LShiftA/FM64x4 (14 ms)
[ RUN      ] Sota4/ArithmeticTest.LShiftA/FM128x4
[       OK ] Sota4/ArithmeticTest.LShiftA/FM128x4 (26 ms)
[ RUN      ] Sota4/ArithmeticTest.TruncA/FM32x4
[       OK ] Sota4/ArithmeticTest.TruncA/FM32x4 (4 ms)
[ RUN      ] Sota4/ArithmeticTest.TruncA/FM64x4
[       OK ] Sota4/ArithmeticTest.TruncA/FM64x4 (5 ms)
[ RUN      ] Sota4/ArithmeticTest.TruncA/FM128x4
[       OK ] Sota4/ArithmeticTest.TruncA/FM128x4 (6 ms)
[ RUN      ] Sota4/ArithmeticTest.P2A/FM32x4
[       OK ] Sota4/ArithmeticTest.P2A/FM32x4 (3 ms)
[ RUN      ] Sota4/ArithmeticTest.P2A/FM64x4
[       OK ] Sota4/ArithmeticTest.P2A/FM64x4 (6 ms)
[ RUN      ] Sota4/ArithmeticTest.P2A/FM128x4
[       OK ] Sota4/ArithmeticTest.P2A/FM128x4 (4 ms)
[ RUN      ] Sota4/ArithmeticTest.A2P/FM32x4
[       OK ] Sota4/ArithmeticTest.A2P/FM32x4 (3 ms)
[ RUN      ] Sota4/ArithmeticTest.A2P/FM64x4
[       OK ] Sota4/ArithmeticTest.A2P/FM64x4 (3 ms)
[ RUN      ] Sota4/ArithmeticTest.A2P/FM128x4
[       OK ] Sota4/ArithmeticTest.A2P/FM128x4 (4 ms)

@deadlywing
Contributor

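That is indeed a bit strange, but judging from the call stack, the problem is most likely inside the mulaa implementation.
The BinaryKernel you mention is not a kernel related to BShare; it refers to a kernel that takes two operands.

Personally, I suggest adding some simple logging to pin down which data was never received and whether the sends and receives deadlocked somewhere. After all, your protocol has a fairly large number of parties, and I am not sure whether every run is stable.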
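The logging suggestion can be sketched offline as follows. This is a hypothetical helper, not part of SPU or yacl: it takes a hand-written list of the send and recv calls one round is supposed to make and reports every recv that has no matching send. The tags and the 4-party schedule are made up for illustration, and the sketch assumes that a key such as `0->2` in a timeout message names the sender and the receiver.

```python
from collections import Counter


def unmatched_recvs(schedule):
    """Return the (src, dst, tag) of every recv that has no matching send.

    schedule is a list of (op, src, dst, tag) tuples with op "send" or "recv".
    """
    sends = Counter((s, d, t) for op, s, d, t in schedule if op == "send")
    missing = []
    for op, s, d, t in schedule:
        if op != "recv":
            continue
        if sends[(s, d, t)] > 0:
            sends[(s, d, t)] -= 1   # consume one matching send
        else:
            missing.append((s, d, t))
    return missing


# Hypothetical round: party 0 sends to parties 1 and 3 but never to party 2.
schedule = [
    ("send", 0, 1, "z01"), ("recv", 0, 1, "z01"),
    ("send", 0, 3, "z03"), ("recv", 0, 3, "z03"),
    ("recv", 0, 2, "z02"),
]
print(unmatched_recvs(schedule))  # [(0, 2, 'z02')]
```

A recv that shows up in this list would block until the link timeout, which is how a missing or mis-tagged send usually surfaces.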

@c-doubley
Author

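I previously had a bug where the sends and receives were written incorrectly, and in that case ArithmeticTest.mulAA failed. So I naively assumed that once ArithmeticTest.mulAA passed, this kind of problem was gone. I will re-check the mulaa implementation to see whether I can find the issue. Thank you for your reply.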

@c-doubley
Author

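Following the guidance in issue #863, I wrote a hash call, but ran into some strange behavior.
Specifically:
When I run SPU's built-in unit tests, ArithmeticTest.mulAA passes, and the check H(a)=H(a) inside it holds.
But when I test multiplication in the testProto.py file that I added under example/python, I get a report that H(a)!=H(a).
Why does the hash match in the unit test but not when called from the upper-layer application?
Below is the code I wrote for the hash call: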
'''
pforeach(0, lhs.numel(), [&](int64_t idx) {
  hasher.Update(yacl::ByteContainerView(
      reinterpret_cast<const char*>(&z01_prime[idx]), sizeof(el_t)));
  std::vector<uint8_t> hash_bytes = hasher.CumulativeHash();
  std::string hash_str(hash_bytes.begin(), hash_bytes.end());
  hash_z01_prime[idx] = std::vector<uint8_t>(hash_str.begin(), hash_str.end());
  hasher.Reset();
});
comm->sendAsync<std::vector<uint8_t>>(3, hash_z01_prime, "send_hash_z01_prime");
'''
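One thing that may be worth ruling out, stated as an assumption and not a confirmed diagnosis: the snippet uses a single `hasher` inside `pforeach`. If those iterations run on several threads, the Update, CumulativeHash, and Reset calls of different indices could interleave, and the digests would then depend on scheduling. A minimal sketch of the safer pattern, in which every element gets its own hasher, is shown below. SHA-256, the 64-bit little-endian element encoding, and the helper names are illustrative choices; the thread does not say which hash or element type is used.

```python
import hashlib
import struct
from concurrent.futures import ThreadPoolExecutor


def hash_element(value: int) -> bytes:
    """Hash one 64-bit element with a fresh, local hasher (no shared state)."""
    h = hashlib.sha256()
    h.update(struct.pack("<Q", value))
    return h.digest()


def hash_all_parallel(values, workers: int = 4):
    """Hash every element on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(hash_element, values))


values = list(range(1000))
serial = [hash_element(v) for v in values]
parallel = hash_all_parallel(values)
print(parallel == serial)
```

Because no state is shared between elements, the parallel and serial runs produce identical digests regardless of how the work is scheduled.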


github-actions bot commented Feb 8, 2025

Stale issue message. Please comment to remove stale tag. Otherwise this issue will be closed soon.
