rand 與 rand_r的差異

ss
8 min readSep 24, 2018

--

最近在寫學校平行化的作業,看到第一關 ,用pthread 算出pi 的值, 採用蒙地卡羅方法去算, 想說輕輕鬆鬆 ,不過就分四條 thread 去算就行了, 但是卻發現神奇的事, 一條thread竟然比兩條甚至四條還快, 我還在想是不是搞錯什麼了,花了很長的時間debug, 結果發現問題竟然出現在這邊, C library, rand(), 更改rand_r()卻正常了, 但是這邊卡關的點我想還是得弄清楚

先看到rand 的source 是怎麼去call的

[來源]

/* Copyright © 1991, 1996 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
02111–1307 USA. */
#include <stdlib.h>#undef rand/* Return a random integer between 0 and RAND_MAX. */
int
rand ()
{
return (int) __random ();
}

可以看到 它去call這個 **__random()** , 接著我們在看發生什麼事

**undef 會移除先前對rand的任何定義**

接著我們往裡面繼續看
[random.c]


/* If we are using the trivial TYPE_0 R.N.G., just do the old linear
congruential bit. Otherwise, we do our fancy trinomial stuff, which is the
same in all the other cases due to all the global variables that have been
set up. The basic operation is to add the number at the rear pointer into
the one at the front pointer. Then both pointers are advanced to the next
location cyclically in the table. The value returned is the sum generated,
reduced to 31 bits by throwing away the “least random” low bit.
Note: The code takes advantage of the fact that both the front and
rear pointers can’t wrap on the same call by not testing the rear
pointer if the front one has wrapped. Returns a 31-bit random number. */
long int
__random ()
{
int32_t retval;
__libc_lock_lock (lock);(void) __random_r (&unsafe_state, &retval);__libc_lock_unlock (lock);return retval;
}

由此我們找到問題所在了, 這邊有一個lock預防 concurrent call 去修改到共用的data
可看initialized的說明

/* POSIX.1c requires that there is mutual exclusion for the `rand’ and
`srand’ functions to prevent concurrent calls from modifying common
data. */
__libc_lock_define_initialized (static, lock)

本意應該是不希望thread concurrent 換別人做時動用到共用data區, 但是在mutli thread 由於每個thread可跑在不同的core上, 所以這項好意在這邊卻妨礙了平行化, 而我們也可以看到他是去call __rand_r 去幫助get random number, 這也是為什麼我們的平行化會比一條thread還慢,因為這邊一旦有一個人卡住沒有執行完 , 其他人也別想去拿到這個值, 除了不能平行化, 可能還會多出等待的時間 , 反而效率下降

接著我們看rand_r做了什麼事

[random_r.c]

/* If we are using the trivial TYPE_0 R.N.G., just do the old linear
congruential bit. Otherwise, we do our fancy trinomial stuff, which is the
same in all the other cases due to all the global variables that have been
set up. The basic operation is to add the number at the rear pointer into
the one at the front pointer. Then both pointers are advanced to the next
location cyclically in the table. The value returned is the sum generated,
reduced to 31 bits by throwing away the “least random” low bit.
Note: The code takes advantage of the fact that both the front and
rear pointers can’t wrap on the same call by not testing the rear
pointer if the front one has wrapped. Returns a 31-bit random number. */
int
__random_r (buf, result)
struct random_data *buf;
int32_t *result;
{
int32_t *state;
if (buf == NULL || result == NULL)
goto fail;
state = buf->state;if (buf->rand_type == TYPE_0)
{
int32_t val = state[0];
val = ((state[0] * 1103515245) + 12345) & 0x7fffffff;
state[0] = val;
*result = val;
}
else
{
int32_t *fptr = buf->fptr;
int32_t *rptr = buf->rptr;
int32_t *end_ptr = buf->end_ptr;
int32_t val;
val = *fptr += *rptr;
/* Chucking least random bit. */
*result = (val >> 1) & 0x7fffffff;
++fptr;
if (fptr >= end_ptr)
{
fptr = state;
++rptr;
}
else
{
++rptr;
if (rptr >= end_ptr)
rptr = state;
}
buf->fptr = fptr;
buf->rptr = rptr;
}
return 0;
fail:
__set_errno (EINVAL);
return -1;
}

這邊就是主要的運算式了,基本上輸入會有兩個值, buf 與 result, result 就是結果, 而buf則是輸入的一種seed, rand()會挑一個默認值,每次call會改變他的默認值,這也是為什麼它用lock去保護他的原因,避免兩個thread取到相同的值

這邊對seed就不做太多的介紹, 主要上我們再做隨機取值是無法作到真隨機的, 所以我們會加入一個seed去左右它隨機出來的數字,當然seed一樣出來的數字也就一樣,通常是用時間來當seed,因為時間會一直做變化,一直走下去

以上就是本次lab卡關分析,建議要跑多thread的話可以直接call rand_r

有關lab的作業都在[github](https://github.com/kweisamx/Pararell_Programming/tree/master/lab1)

若觀念上有錯誤或是有任何想幫忙補充的也都歡迎提出,謝謝

--

--

ss
ss

No responses yet