Failed internal tests #3

@elvircrn

On this version of the code: #2

and (at least) on the following matrices from https://sparse.tamu.edu/Williams:

  • cant/cant.mtx
  • pdb1HYS/pdb1HYS.mtx

with the flag -D CHECK_RESULT=1, the code produced the following output, which shows that the check failed (the cuSPARSE reference reports nnzC = 0 and prints "cuSPARSE failed!"):

Input:

./test -d 0 -aat 0 cant/cant.mtx

Output:

--------------------------------!!!!!!!!------------------------------------
device_id = 0
---------------------------------------------------------------
Device [ 0 ] GeForce GTX 1650 Ti @ 1485.00 MHz
MAT: -------------- cant/cant.mtx --------------
input matrix A: ( 62451, 62451 ) nnz = 4007383
 loadfile time    = 0.67493 sec
the tilesize = 16
SpGEMM nnzCub = 269486473
CSR to Tile conversion uses 28.78 ms
tile space overhead = 37.74 MB
step1 ----Calculate the number and tile-column index of tiles of matrixC---
step1 ---------------------- Runtime is  0.37 ms-------------------------

step2 --------Calculate the number of nonzeros of each tile of matrixC-----
step2 ---------------------- Runtime is  4.06 ms-------------------------

step3 ---------Calculate the val&col of nonzeros of matrixC-------------
step3 ---------------------- Runtime is  48.40 ms------------------------

-----------------------Malloc uses 0.71 ms-------------------------------
Non-empty tiles of C = 194910
nnzC = 17440029
CUDA  TileSpGEMM runtime is 53.63 ms, gflops = 10.05
-------------------------------check----------------------------------------
tile to CSR conversion complete!

--------------- SpGEMM (using cuSPARSE) ---------------
 - cuda SpGEMM start! Benchmark runs 1 times.
 - cuda SpGEMM completed!

nnzC = 0, nnzCub = 269486473, Compression rate =  inf
CUDA  cuSPARSE SpGEMM runtime is 1.3550 ms, GFlops = 397.7660
cuSPARSE failed!
---------------------------------------------------------------
---------------------------------------------------------------

Input:

./test -d 0 -aat 0 pdb1HYS/pdb1HYS.mtx

Output:

--------------------------------!!!!!!!!------------------------------------
device_id = 0
---------------------------------------------------------------
Device [ 0 ] GeForce GTX 1650 Ti @ 1485.00 MHz
MAT: -------------- pdb1HYS/pdb1HYS.mtx --------------
input matrix A: ( 36417, 36417 ) nnz = 4344765
 loadfile time    = 0.69516 sec
the tilesize = 16
SpGEMM nnzCub = 555322659
CSR to Tile conversion uses 33.98 ms
tile space overhead = 40.01 MB
step1 ----Calculate the number and tile-column index of tiles of matrixC---
step1 ---------------------- Runtime is  0.34 ms-------------------------

step2 --------Calculate the number of nonzeros of each tile of matrixC-----
step2 ---------------------- Runtime is  6.93 ms-------------------------

step3 ---------Calculate the val&col of nonzeros of matrixC-------------
step3 ---------------------- Runtime is  93.50 ms------------------------

-----------------------Malloc uses 0.95 ms-------------------------------
Non-empty tiles of C = 221571
nnzC = 19594581
CUDA  TileSpGEMM runtime is 101.79 ms, gflops = 10.91
-------------------------------check----------------------------------------
tile to CSR conversion complete!

--------------- SpGEMM (using cuSPARSE) ---------------
 - cuda SpGEMM start! Benchmark runs 1 times.
 - cuda SpGEMM completed!

nnzC = 0, nnzCub = 555322659, Compression rate =  inf
CUDA  cuSPARSE SpGEMM runtime is 1.3250 ms, GFlops = 838.2229
cuSPARSE failed!
---------------------------------------------------------------
---------------------------------------------------------------

However, when run against https://sparse.tamu.edu/SNAP/CollegeMsg,

Input:

./test -d 0 -aat 0 CollegeMsg/CollegeMsg.mtx

Output:

--------------------------------!!!!!!!!------------------------------------
device_id = 0
---------------------------------------------------------------
Device [ 0 ] GeForce GTX 1650 Ti @ 1485.00 MHz
MAT: -------------- /home/elvircrn/tug/thesis/repo/matrices/CollegeMsg/CollegeMsg.mtx --------------
input matrix A: ( 1899, 1899 ) nnz = 20296
 loadfile time    = 0.00273 sec
the tilesize = 16
SpGEMM nnzCub = 744395
CSR to Tile conversion uses 1.14 ms
tile space overhead = 0.61 MB
step1 ----Calculate the number and tile-column index of tiles of matrixC---
step1 ---------------------- Runtime is  0.20 ms-------------------------

step2 --------Calculate the number of nonzeros of each tile of matrixC-----
step2 ---------------------- Runtime is  0.90 ms-------------------------

step3 ---------Calculate the val&col of nonzeros of matrixC-------------
step3 ---------------------- Runtime is  3.51 ms------------------------

-----------------------Malloc uses 0.46 ms-------------------------------
Non-empty tiles of C = 14154
nnzC = 407071
CUDA  TileSpGEMM runtime is 5.17 ms, gflops = 0.29
-------------------------------check----------------------------------------
tile to CSR conversion complete!

--------------- SpGEMM (using cuSPARSE) ---------------
 - cuda SpGEMM start! Benchmark runs 1 times.
 - cuda SpGEMM completed!

nnzC = 407071, nnzCub = 744395, Compression rate = 1.83
CUDA  cuSPARSE SpGEMM runtime is 1.7550 ms, GFlops = 0.8483

Validating results...
[PASSED] nnzC = 407071
[PASSED] row_pointer
[PASSED] column_index & value
---------------------------------------------------------------
---------------------------------------------------------------

the code passes its own tests.

Therefore, given this setup, I was unable to reproduce the results from the paper. Please let me know if I have made an error at some point, or if more information is needed.
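As a sanity check on the numbers in the three logs, here is a minimal Python sketch. The formulas are assumptions inferred from the printed values, not taken from the source code: gflops appears to be 2 * nnzCub / runtime, and the compression rate appears to be nnzCub / nnzC. The helper names (gflops, compression_rate, reference_is_usable) are hypothetical. If the assumptions hold, the reported cuSPARSE GFlops for cant and pdb1HYS follow from a 1.3 ms run that produced nnzC = 0, so they probably do not reflect a completed reference multiplication.

```python
import math

def gflops(nnz_cub: int, runtime_ms: float) -> float:
    """Assumed formula: 2 flops per intermediate product, over the runtime."""
    return 2.0 * nnz_cub / (runtime_ms * 1e-3) / 1e9

def compression_rate(nnz_cub: int, nnz_c: int) -> float:
    """Assumed formula: intermediate products per output nonzero."""
    return math.inf if nnz_c == 0 else nnz_cub / nnz_c

def reference_is_usable(run: dict) -> bool:
    """A reference reporting nnzC = 0 while products exist cannot validate anything."""
    return not (run["ref_nnz_c"] == 0 and run["nnz_cub"] > 0)

# Values copied from the logs in this issue.
RUNS = {
    "cant": {"nnz_cub": 269486473, "tile_nnz_c": 17440029,
             "tile_ms": 53.63, "ref_nnz_c": 0, "ref_ms": 1.3550},
    "pdb1HYS": {"nnz_cub": 555322659, "tile_nnz_c": 19594581,
                "tile_ms": 101.79, "ref_nnz_c": 0, "ref_ms": 1.3250},
    "CollegeMsg": {"nnz_cub": 744395, "tile_nnz_c": 407071,
                   "tile_ms": 5.17, "ref_nnz_c": 407071, "ref_ms": 1.7550},
}

for name, run in RUNS.items():
    status = "usable" if reference_is_usable(run) else "NOT usable (nnzC = 0)"
    print(f"{name}: tile gflops = {gflops(run['nnz_cub'], run['tile_ms']):.2f}, "
          f"ref gflops = {gflops(run['nnz_cub'], run['ref_ms']):.4f}, "
          f"ref compression = {compression_rate(run['nnz_cub'], run['ref_nnz_c']):.2f}, "
          f"reference {status}")
```

Under these assumptions the script matches the logged gflops and compression figures, and it flags the reference as unusable only for cant and pdb1HYS.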

Thanks,
Elvir
