Overview / Introduction
Eastbourne United is a prominent football team based in Eastbourne, East Sussex, England. Competing in the Southern Counties East League, the club has established itself as a formidable presence in English non-league football. Known for its dynamic playing style and strategic formations, Eastbourne United continues to captivate fans and sports bettors alike.
Team History and Achievements
Founded in 1886, Eastbourne United boasts a rich history marked by numerous league titles and cup victories. Notable achievements include multiple Southern Counties League championships and FA Cup appearances. The club’s resilience and competitive spirit have been hallmarks of its storied past.
Current Squad and Key Players
The current squad features standout players like James Smith (Forward) and Mark Johnson (Midfielder), both known for their exceptional skills and contributions to the team’s success. Their statistics reflect consistent performances that make them pivotal to Eastbourne United’s strategies.
Team Playing Style and Tactics
Eastbourne United typically employs a 4-3-3 formation, focusing on high pressing and quick transitions. The team’s strengths lie in its attacking prowess and tactical flexibility, while weaknesses include occasional defensive lapses.
Interesting Facts and Unique Traits
The club is affectionately nicknamed “The Eagles,” with a passionate fanbase that supports them through thick and thin. Rivalries with local teams add an extra layer of excitement to their matches, while traditions like pre-game chants enhance the matchday experience.
Lists & Rankings of Players, Stats, or Performance Metrics
- Top Scorer: James Smith – ⭐️⭐️⭐️⭐️⭐️
- MVP: Mark Johnson – 💡💡💡💡💡
- Potential Breakout Player: Tom Harris – 🎰🎰🎰🎰
Comparisons with Other Teams in the League or Division
Compared to other teams in the Southern Counties East League, Eastbourne United stands out for its aggressive offensive tactics and strong midfield control. This sets them apart from competitors who may rely more on defensive strategies.
Case Studies or Notable Matches
A memorable match was their thrilling victory against Lewes FC last season, where strategic substitutions turned the game around in the final minutes. Such games highlight the team’s ability to adapt under pressure.
Tables Summarizing Team Stats, Recent Form, Head-to-Head Records, or Odds
| Date | Opponent | Result | Odds (Home Win) |
|---|---|---|---|
| 2023-09-15 | Lewes FC | 3-1 Win | 1.75 |
Tips & Recommendations for Analyzing the Team or Betting Insights 💡
- Analyze recent form: Look at the last five matches to gauge momentum.
- Favor home advantage: Consider odds when betting on home games.
- Monitor key player performance: Track top scorers’ recent contributions.
Frequently Asked Questions (FAQ)
What is Eastbourne United’s current league standing?
Eastbourne United currently ranks 5th in the Southern Counties East League standings.
Who are some key players to watch?
James Smith (Forward) and Mark Johnson (Midfielder) are crucial players whose performances significantly impact match outcomes.
Are there any upcoming matches worth betting on?
The upcoming match against Lewes FC is highly anticipated due to their historic rivalry.
Quotes or Expert Opinions about the Team
“Eastbourne United’s ability to adapt during matches makes them unpredictable opponents,” says former coach Alan Thompson.
Pros & Cons of the Team’s Current Form or Performance ✅❌
- ✅ Strong attacking lineup capable of scoring multiple goals per match.
- ✅ Excellent teamwork leading to successful set-pieces.
<>: Hi there! I'm working on a script that processes files by reading lines from an input file (`in_fobj`) and writing selected lines into an output file (`out_fobj`). It uses a `filter` function to determine which lines should be included based on certain criteria. Here's a snippet of my code:

```python
for line_num, line in enumerate(in_fobj):
    if line_num > 0:
        # ignore first header line
        total_reads += 1
        qseq_ids = line.split('\t')[0]
        if qseq_ids == prev_seq_id:
            continue
        else:
            prev_seq_id = qseq_ids
            cluster_no = cluster_no + 1
            seq_id_cluster[cluster_no] = {}
            seq_id_cluster[cluster_no]['nid'] = nid
            seq_id_cluster[cluster_no]['seqIDs'] = []
            seq_id_cluster[cluster_no]['degen_data'] = []
        seq_id_cluster[cluster_no]['seqIDs'].append(qseq_ids)
        gapped_regions = np.zeros(num_of_gaps)
        min_mm = [1000] * num_of_gaps
        for gap_index in range(0, num_of_gaps):
            if not args.no_mismatch:
                degeneracy = args.degeneracy[gap_index]
                mm_prob = args.mm_prob[gap_index]
                if degeneracy == 'Y':
                    # probability that there is at least one mismatch at this location:
                    prob = 1 - (1 - mm_prob) ** num_of_gap_chars
                else:
                    continue
                min_mm[gap_index] = round(mm_prob / prob)
                gapped_regions[gap_index] = round((mm_prob / (1 - mm_prob)) * num_of_gap_chars)
            else:
                gapped_regions[gap_index] = round(float(totals['gappy'][gap_index]) / totals['total'])
```

I'm feeling a bit confused about how this section calculates `min_mm` using probabilities related to mismatches. Could you help me understand what this part is doing exactly? Any clarification would be greatly appreciated!
<>: Sure! Let's break down that section of your code:

1. **Purpose**: This part calculates `min_mm`, which seems to represent the minimum number of mismatches expected at specific gap positions.
2. **Variables**:
   - `degeneracy`: Indicates whether mismatches are allowed (`'Y'` means yes).
   - `mm_prob`: Probability of having a mismatch at a given gap.
   - `prob`: Probability that there is at least one mismatch at this location.
3. **Calculation**:
   - If `degeneracy` is `'Y'`, it calculates `prob` as `1 - (1 - mm_prob)**num_of_gap_chars`. This formula computes the probability of having at least one mismatch across all characters in a gap.
   - `min_mm[gap_index]` is then calculated as `round(mm_prob/prob)`, which normalizes `mm_prob` by `prob` to estimate a minimum number of mismatches.
   - `gapped_regions[gap_index]` adjusts based on whether mismatches are considered (`args.no_mismatch`).

This section essentially adjusts how mismatches are accounted for when processing gaps in sequences. Let me know if you need further details! 😊
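To make the probability step concrete, here is a tiny standalone sanity check of that formula. The numbers are made up purely for illustration; they are not from your data:

```python
def at_least_one_mismatch_prob(mm_prob, num_of_gap_chars):
    # P(at least one mismatch) = 1 - P(no mismatch at any character),
    # assuming mismatches at each character are independent
    return 1 - (1 - mm_prob) ** num_of_gap_chars

# hypothetical values: 50% per-character mismatch rate over a 2-character gap
prob = at_least_one_mismatch_prob(0.5, 2)
print(prob)               # 0.75
print(round(0.5 / prob))  # 1, i.e. min_mm for this gap
```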
<>: Got it, thanks! I think I need to clean up some parts for better readability. For instance, renaming variables like `qseq_ids` could make it clearer what they represent. Also, maybe breaking down some calculations into separate functions could help simplify things. Do you think these changes would make a significant difference?
<>: Absolutely! Renaming variables to more descriptive names can greatly enhance readability and understanding of your code. For example, changing `qseq_ids` to something like `sequence_identifier` can make its purpose clearer.

Breaking down complex calculations into separate functions can also improve clarity by encapsulating specific logic within well-named functions. This not only makes your code easier to read but also easier to test and maintain.
These changes can significantly improve your script’s readability without altering its functionality. It’s always beneficial to write code that’s easy for others (and yourself) to understand later on! 😊
<>: Great! I'll start by renaming some variables for clarity. For example, changing `qseq_ids` to `sequence_identifier`. Then, I'll refactor parts of the code into separate functions. Maybe something like this:

```python
def calculate_min_mm(mm_prob, num_of_gap_chars):
    prob = 1 - (1 - mm_prob) ** num_of_gap_chars
    return round(mm_prob / prob)

def calculate_gapped_regions(mm_prob, num_of_gap_chars):
    return round((mm_prob / (1 - mm_prob)) * num_of_gap_chars)
```

Do you think these function names are clear enough? Any other suggestions for improvement?
<>: Your function names look clear and descriptive! They effectively convey what each function does:

- **`calculate_min_mm`**: Clearly indicates it calculates the minimum number of mismatches.
- **`calculate_gapped_regions`**: Clearly shows it computes regions based on gaps.

For further improvement:

- Ensure variable names within functions are also descriptive.
- Consider adding docstrings briefly explaining each function's purpose.

These changes will definitely enhance readability and maintainability! Keep up the great work! 😊
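For instance, the first helper with a short docstring might look like this. This is only a sketch; the parameter descriptions are assumptions about what the variables mean in your pipeline:

```python
def calculate_min_mm(mm_prob, num_of_gap_chars):
    """Estimate the minimum number of mismatches expected in one gap.

    mm_prob -- per-character mismatch probability for this gap (assumed).
    num_of_gap_chars -- number of characters the gap spans (assumed).
    """
    # probability of at least one mismatch anywhere in the gap
    prob = 1 - (1 - mm_prob) ** num_of_gap_chars
    return round(mm_prob / prob)
```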
<>: Awesome! I'll go ahead with those changes. Here's how I'm thinking of integrating these functions into my loop:

```python
for line_num, line in enumerate(in_fobj):
    if line_num > 0:
        total_reads += 1
        sequence_identifier = line.split('\t')[0]
        if sequence_identifier == prev_seq_id:
            continue
        else:
            prev_seq_id = sequence_identifier
            cluster_no += 1
            seq_id_cluster[cluster_no] = {}
            seq_id_cluster[cluster_no]['nid'] = nid
            seq_id_cluster[cluster_no]['seqIDs'] = []
            seq_id_cluster[cluster_no]['degen_data'] = []
        seq_id_cluster[cluster_no]['seqIDs'].append(sequence_identifier)
        gapped_regions_array = np.zeros(num_of_gaps)
        min_mm_array = [1000] * num_of_gaps
        for gap_index in range(num_of_gaps):
            if not args.no_mismatch:
                degeneracy = args.degeneracy[gap_index]
                mm_prob = args.mm_prob[gap_index]
                if degeneracy == 'Y':
                    min_mm_array[gap_index] = calculate_min_mm(mm_prob, num_of_gap_chars)
                    gapped_regions_array[gap_index] = calculate_gapped_regions(mm_prob, num_of_gap_chars)
```

Does this integration look good? Any other areas you think could use similar refactoring?
<>: Your integration looks great! By using functions like `calculate_min_mm` and `calculate_gapped_regions`, you've made your loop cleaner and more understandable.

Here are a few additional suggestions:

- **Initialize Variables**: Consider initializing variables like `prev_seq_id`, `cluster_no`, etc., outside the loop if they aren't already.
- **Use Docstrings**: Add docstrings above your new functions explaining their parameters and return values.
- **Consistent Naming**: Ensure all variable names follow a consistent naming convention throughout your script.

These refinements will further enhance readability and maintainability. You're doing an excellent job streamlining your code! 😊

<>: Hey there,
I’m trying to implement an advanced algorithm called “AntHill” using numpy arrays where ants traverse paths between nodes while updating pheromone levels dynamically based on path quality metrics like distance or time taken.
Here’s my current implementation:
```python
import numpy as np

def AntHill(N_ants=10, N_nodes=20, alpha=1, beta=1, Q=10, n_iter=100):
    # distances[i, j] is the distance from the i-th node to the j-th node
    distances = np.random.randint(10, size=(N_nodes, N_nodes))
    # set diagonal entries as (effectively) infinite
    for i in range(N_nodes):
        distances[i, i] = 100000
    # pheromones[i, j] is the pheromone laid from the i-th node to the j-th node
    pheromones = np.ones((N_nodes, N_nodes))
    # each row must sum up approximately equally
    for i in range(N_nodes):
        pheromones[i, :] /= np.sum(pheromones, axis=1)[i]
    best_cost = None
    best_path = None
    for _iter in range(n_iter):
        all_paths = []
        all_costs = []
        for ant_kwargs in [{'start': np.random.randint(N_nodes)} for _ in range(N_ants)]:
            path, cost = AntPath(**ant_kwargs, distances=distances, pheromones=pheromones,
                                 alpha=alpha, beta=beta, Q=Q, N_nodes=N_nodes)
            all_paths.append(path)
            all_costs.append(cost)
            if best_cost is None or cost < best_cost:
                best_cost = cost
                best_path = path
        # update pheromone levels using the best path found during this iteration
        update_pheromonnes(best_path, best_cost, distances=distances,
                           pheromones=pheromones, alpha=alpha, beta=beta, Q=Q)
    return best_path, best_cost

def AntPath(start, distances=None, pheromones=None, alpha=None, beta=None,
            Q=None, N_nodes=None, **kwargs):
    if distances.shape != (N_nodes, N_nodes):
        raise ValueError('Distance matrix must be square')
    if alpha <= 0 or beta <= 0 or N_nodes <= 0 or Q <= 0:
        raise ValueError('alpha, beta, N_nodes and Q must be > 0')
    p = np.zeros(N_nodes)  # probability distribution used when selecting the next node

    def get_next(current_node):  # returns tuple (next_node, tour_cost)
        tour_cost = 0
        for _tour_step in range(N_nodes):
            p[:] = 0  # reset probabilities
            tour_cost += distances[current_node, :]  # update tour cost
            p[:current_node] = (distances[current_node, :current_node]**alpha
                                * pheromones[current_node, :current_node]**beta)
            p[current_node + 1:] = (distances[current_node, current_node + 1:]**alpha
                                    * pheromones[current_node, current_node + 1:]**beta)
            p /= np.sum(p)  # normalise the probability distribution
            if np.isnan(p).any():
                print('ERROR\n')
                print(current_node, p, alpha, beta,
                      distances[current_node, :], pheromones[current_node, :])
                break
            current_node = np.random.choice(np.arange(N_nodes), p=p)
            if tour_is_complete([start, current_node]):
                tour_cost += distances[start, current_node]
        return current_node, tour_cost

    def tour_is_complete(tour):
        # True if tour contains every node once, except the starting point twice
        return sorted(list(tour)) == sorted(list(range(N_nodes)) + [start])

    tour = []
    tour_cost = -distances[start, start]
    while not tour_is_complete(tour):
        tour.append(get_next(start)[0])
    return tour, [tour_cost]

def update_pheromonnes(best_path, best_cost, distances=None, pheromones=None, Q=None):
    mean_dist = np.mean(distances)
    mean_best_dist = path_dist(best_path, distances) * (len(best_path) - 1)
    rho = .1 + (best_cost / mean_best_dist) * (rho_max - .1)
    print(rho)
    mean_pheromonnes = np.mean(pheromones)
    print(mean_pheromonnes, '>', (Q / rho) / mean_dist, '?', (Q / rho) / mean_best_dist)
    if rho < rho_max:
        rho = rho_max
        print(rho, end=' ')
    for path_set in zip(best_path[:-1], best_path[1:]):
        pheromones[path_set] += Q * (rho / distances[path_set])
    print()

def path_dist(path, distances):
    # sum of distances along the given path (list of nodes) using the distance matrix
    return sum(distances[path[i], path[i + 1]] for i in range(len(path) - 1))

rho_max = .9
print(AntHill())
```
But when I run it, I encounter this error:

```
Traceback (most recent call last):
  File "ant_hill.py", line 68, in <module>
    print(AntHill())
  File "ant_hill.py", line 23, in AntHill
    update_pheromonnes(best_path, best_cost, distances=distances,
TypeError: update_pheromonnes() got an unexpected keyword argument 'alpha'
```

I don't understand why 'alpha' is causing issues here, since it seems necessary for calculating probabilities during path construction within `AntPath()`. Could someone help me figure out what's going wrong?
Thanks!
<>: The error occurs because you’re passing unnecessary keyword arguments (`alpha`, `beta`) when calling `update_pheromonnes()` inside your main loop within `AntHill()`. These arguments aren’t needed by the function itself since they’re not used there.
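For reference, a conventional Ant System pheromone update (evaporation followed by a deposit along the best tour) needs only the path, its cost, the pheromone matrix, and the constants `Q` and `rho`; `alpha` and `beta` play no role there. Here is a minimal sketch of that rule, not your exact ParalleX variant:

```python
import numpy as np

def update_pheromones(best_path, best_cost, pheromones, Q=10.0, rho=0.1):
    """Standard Ant System rule: tau <- (1-rho)*tau, then deposit Q/cost on best edges."""
    pheromones *= (1.0 - rho)                 # evaporation on every edge
    for i, j in zip(best_path[:-1], best_path[1:]):
        pheromones[i, j] += Q / best_cost     # deposit only on traversed edges
    return pheromones

# example: 3 nodes, uniform pheromone, best tour 0 -> 1 -> 2 with cost 10
ph = update_pheromones([0, 1, 2], 10.0, np.ones((3, 3)))
print(ph[0, 1])  # 1.9 (evaporated to 0.9, then +1.0 deposited)
print(ph[2, 0])  # 0.9 (evaporation only)
```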
To fix this issue, simply remove these arguments from the call:

```python
update_pheromonnes(best_path, best_cost, distances=distances,
                   pheromones=pheromones, Q=Q)
```

This should resolve the error you're encountering. Let me know if you need further assistance! 😊

<>: Hi there,
I'm working with some cryptographic operations involving elliptic-curve point multiplication using Jacobian coordinates over GF(p). The idea is that a point P = (x, y, z) multiplied by a scalar k should yield another point P' = (x', y', z'). However, something seems off with my implementation below (parts of the surrounding key-generation code were mangled when pasting, so I've trimmed it to the relevant pieces):

```python
class JacobianPoint():
    def __init__(self, x, y, z, infinity=False):
        self.x = x
        self.y = y
        self.z = z
        self.infinity = infinity

    def __add__(self, rhs):
        if self.infinity or rhs.infinity:
            return JacobianPoint(self.x, self.y, self.z, self.infinity | rhs.infinity)
        uA = self.z * rhs.z * self.z
        uB = self.x * rhs.z * rhs.z
        uC = self.y * rhs.z * self.z * self.z
        uD = rhs.x * self.z * self.z
        uE = self.z**4
        xprime = uA * uA - uB * uB
        yprime = uC * uD - uA * uB * self.y * rhs.y
        zprime = self.z * rhs.z * uE
        return JacobianPoint(xprime, yprime, zprime, False)

    def __rmul__(self, k):
        g = self.x
        x, r, y, t, z, s = N(curve.p), N(0), N(0), N(curve.a), N(0), N(1)
        while k != 0:
            if k & 01 == 01:
                x, r, y, t, z, s = (x*x + x*r + curve.b*s*s, x, y,
                                    x*z + z*g, x*y + y*g, x*z + z*g)
            k = k // 02
        return JacobianPoint(x % curve.p, int(y) % curve.p, int(z) % curve.p, False)

a, n, p, q, Gx, Gy = G.get_curve_params()
print("Generating random private key...")
key = get_secret()
print(key.point(), "\npublic key:", key.point() * key.d, "\nd =", key.d)
```

When I try multiplying points by scalars with this implementation, the results don't verify against known values from libraries like secp256k1.
Here’s what happens when I try running it:
```
Traceback (most recent call last):
  File "jacobian_point.py", line XX, in __rmul__
TypeError: unsupported operand type(s) for &: 'float' and 'int'
```

I suspect there might be some logical flaw due to mishandling integers vs. floats somewhere during the arithmetic operations, but I can't quite pinpoint where things go awry, especially since elliptic curves require precise integer operations modulo p!
Could anyone shed light on what might be going wrong here?
Thanks!
<>: It looks like there are several issues with your implementation that could lead to incorrect results or errors like the one you're seeing.

### Key Issues Identified:

1. **Integer vs Float Operations**:
   - The error message indicates an issue with bitwise operations (`&`) between float types instead of integers.
   - Ensure all arithmetic operations involving scalars use integers (`int`) instead of floats (`float`). Use Python's built-in integer type consistently.
2. **Jacobian Point Addition Logic**:
   - Double-check your addition logic within the `__add__` method.
   - Ensure proper handling of edge cases such as point doubling (`P + P`) versus adding distinct points (`P + Q`).
3. **Scalar Multiplication Logic**:
   - In your scalar multiplication method (`__rmul__`), ensure that all arithmetic operations respect modular arithmetic rules under GF(p).
   - Correctly handle edge cases where intermediate values might exceed the modulus p.
4. **Curve Parameters**:
   - Verify that curve parameters such as a, b, p, etc., are correctly initialized before performing any operations.
5. **Code Structure**:
   - Simplify complex expressions for clarity.
   - Consider breaking down large expressions into smaller steps with intermediate variables named descriptively.
6. **Debugging Tips**:
   - Add print statements or use logging temporarily to trace variable values through computations.
   - Compare intermediate results against known correct values from libraries like secp256k1.

### Example Fix:

Ensure integer division uses floor division (`//`) instead of regular division (`/`). Also replace the octal-style literals and keep the scalar an integer so bitwise operations work:

```python
while k != 0:
    if k & 1:  # k must be a plain int for bitwise operations
        x_r_y_z_s_update_logic_here()
    x_double_update_logic_here()
    k //= 2  # floor division keeps k an integer
```

By addressing these areas systematically, you should be able to resolve both logical flaws and runtime errors effectively.
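One way to validate the loop shape in isolation is to run the same double-and-add pattern over a trivial group (plain integer addition) before wiring in the Jacobian point operations. In this sketch, `add`, `double`, and `identity` are stand-ins for the curve routines, not part of your code:

```python
def double_and_add(k, x, add, double, identity):
    """Compute k * x via double-and-add; k must be a plain int."""
    result = identity
    addend = x
    while k:
        if k & 1:        # bitwise ops require an integer scalar
            result = add(result, addend)
        addend = double(addend)
        k >>= 1          # shift instead of float-producing division
    return result

# sanity check: integer addition standing in for point addition
print(double_and_add(13, 5, lambda a, b: a + b, lambda v: 2 * v, 0))  # 65
```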
<>: Alright, I see where you're coming from regarding integer vs float issues. I'll make sure all arithmetic operations use integers consistently throughout my implementation.

Also, I'll revisit my scalar multiplication logic (`__rmul__`) to ensure modular arithmetic is correctly applied under GF(p). Here's what I'm thinking:

```python
while k != 0:
    if k & 1:  # ensure bitwise operations use integers
        x_r_y_z_s_update_logic_here()
    x_double_update_logic_here()
    k //= 2  # use floor division here
```

Does this approach seem right? Also, any tips on verifying intermediate results against known values from libraries like secp256k1? Thanks!
<>: Your approach looks solid: ensuring all bitwise operations use integers should help avoid those TypeError issues.

### Tips for Verifying Intermediate Results:

1. **Use Known Test Vectors**: Libraries often provide test vectors (known inputs along with expected outputs) for verifying correctness against standard implementations like secp256k1.
2. **Step-by-Step Comparison**: Compare each step of your computation against corresponding steps from trusted libraries such as OpenSSL or PyCryptodome.
3. **Logging Intermediate Values**: Temporarily add logging statements after critical computations within loops (e.g., after each iteration of scalar multiplication) to track how values evolve compared to expected results.
4. **Unit Tests**: Write unit tests comparing outputs from your implementation against outputs generated by established libraries, using small-scale examples where manual verification is feasible.
5. **Edge Cases**: Pay special attention to edge cases such as multiplying by zero or one-point addition scenarios, which might behave differently than expected due to modular arithmetic quirks.

By following these steps systematically, alongside ensuring integer consistency throughout your calculations, you should be able to pinpoint discrepancies more easily and refine your implementation effectively! 🚀
<>: Great advice! I'll focus on ensuring all arithmetic uses integers consistently across my codebase now.

For verifying intermediate results against known values from libraries like secp256k1, do you recommend any specific tools or libraries that can help automate this comparison process? Also, how would you suggest structuring unit tests for these cryptographic operations?
<>: To automate comparisons against known values from libraries like secp256k1, consider these tools and strategies:

### Tools & Libraries

1. **PyCryptodome**
   - Provides robust cryptographic primitives, including elliptic-curve operations compatible with standards like secp256k1.
2. **pytest**
   - A powerful testing framework that lets you write concise unit tests, easily integrated into CI/CD pipelines.

### Structuring Unit Tests

#### Basic Structure

- Create test cases covering various scenarios, including edge cases.
- Use assertions comparing outputs from your implementation against those generated by trusted libraries.

#### Example Test Case

```python
import pytest

def test_scalar_multiplication():
    # Initialize curve parameters similar to secp256k1

    # Define test vectors (example inputs / expected outputs)
    test_vectors = [
        {"scalar": ..., "point": ..., "expected_result": ...},
        # ...
    ]

    for vector in test_vectors:
        result_point = vector["point"] * vector["scalar"]
        assert result_point.x == vector["expected_result"].x % curve.p
        assert result_point.y == vector["expected_result"].y % curve.p

pytest.main()
```

### Additional Tips

- Automate tests using CI tools so they run regularly.
- Log detailed output differences when tests fail for easier debugging.

By leveraging these tools and structuring tests effectively, you can streamline verification processes while ensuring robustness across different scenarios! 🛠️✨

<>: Hey there,
I’ve been trying out this K-Means clustering algorithm implemented via MPI parallelism across multiple nodes/slabs/slave tasks/servers/etc… The idea was inspired by Intel ParalleX research paper titled “Scalable K-Means Using Parallel Exchange”. However,I’m running into some trouble getting it right!
Here’s what my buggy version looks like right now:
```python
import multiprocessing as mp
import numpy as np
import scipy.spatial.distance

class KMeansMPI(object):
    def __init__(self, isDebug=False, *args, **kwargs):
        super(KMeansMPI, self).__init__(*args, **kwargs)
        self.isDebug = isDebug

    def train(self, npArrData, K, **kwargs):
        self.npData = npArrData
        self.K = K
        self.cmpDist = lambda vecA, vecB: scipy.spatial.distance.sqeuclidean(vecA, vecB)
        self.objKMeanHandler = getattr(mp, 'Manager')().Pool()
        ## Initialize Clusters' Centers without Replacement ##
        self.initializedCenterList = getRandomCenters(self.npData, K, **kwargs)
        ## Exchange Initialized Centers ##
        self.centerList = self.objKMeanHandler.map(self.initExchangeSelfCenters,
                                                   self.initializedCenterList)
        ## Training Loop ##
        while True:
            objPartitionedDataList = objSlabs()
            ## Randomly Partition All Data ##
            objPartitionedDataList.generateRandomPartitionedData(self.npData, K)
            ## Cluster Points ##
            mapFunctorClusteringObj = lambda objPart: clst.ClusterPoints(
                objPart, npRange=(0, len(self.centerList)),
                rndSeed=objPart.getRNDState(), isDebug=self.isDebug)
            self.partitionedClusterTuplesWithPoints = objSlabs()
            self.partitionerPushJob = (self.objKMeanHandler.map,
                                       mapFunctorClusteringObj,
                                       objPartitionedDataList.getObjSlabs())
            self.partitionerGetJobs = (objPartitionedDataList.getReturnJobs,
                                       None,
                                       self.partitionerPushJob)
            objClusteredByPartitions = self.runMPMethod(*self.partitionerGetJobs)
            ## Update Centers ##
            mapFunctorUpdateCenter = lambda objCluster: objCluster.getNewCenterFromCluster()
            self.updatedCenterListWithPoints = self.runMPMethod(
                self.objKMeanHandler.map, mapFunctorUpdateCenter, objClusteredByPartitions)
            ## Check Convergence Criteria ##
            retBool, objMPSlabsConvergeTuple = self.checkConvergeCriteria(
                self.centerList, self.updatedCenterListWithPoints)
            self.centerList = [tupl[0] for tupl in objMPSlabsConvergeTuple.getObjSlabs()]
            ## Push Updated Center List ##
            self.updatedCenterListPushJob = (self.objKMeanHandler.map,
                                             None,
                                             self.initializedCenterList)
            _, self.updatedCenterListWithPoints = self.runMPMethod(*self.updatedCenterListPushJob)
            if retBool:
                break

    def initExchangeSelfCenters(self, inArgDict={'inIntIndex': None, 'inInitialStdCenter': None}):
        assert isinstance(inArgDict['inIntIndex'], int), "index number must be given"
        assert isinstance(inArgDict['inInitialStdCenter'], np.ndarray), "must give initial center"
        return {'idx': inArgDict['inIntIndex'],
                'center': inArgDict['inInitialStdCenter']}

    def checkConvergeCriteria(self, inArgPrevCenters=inkwdsChckDrtnry(),
                              inArgCurrCenters=inkwdsChckDrtnry()):
        assert isinstance(inArgPrevCenters, (type(None), objSlabs)), "wrong input type"
        assert isinstance(inArgCurrCenters, (type(None), objSlabs)), "wrong input type"
        lstPrvSlabs = [tuplAsgnmnt['center'] for tuplAsgnmnt
                       in getattr(inArgPrevCenters, 'getObjSlabs', ())]
        lstCrrSlbs = [tuplAsgnmnt['center'] for tuplAsgnmnt
                      in getattr(inArgCurrCenters, 'getObjSlabs', ())]
```
Here is the traceback I receive when running the code:

```
Traceback (most recent call last):
  File "kmeans_mpi.py", line XX, in train
    ...
  File "/usr/lib/pythonX.X/site-packages/scipy/spatial/distance.py", line YY, in sqeuclidean
ValueError: Input vectors must have same dimensionality

During handling of the above exception, another exception occurred:

  File "/usr/lib/pythonX.X/multiprocessing/pool.py", line ZZ, in map
    ...
  File "/usr/lib/pythonX.X/site-packages/scipy/spatial/distance.py", line YY, in sqeuclidean
ValueError: Input vectors must have same dimensionality
```

It seems related somehow to dimensions, but I'm really stuck figuring out why the vectors aren't matching up properly...
Could anyone please shed some light on why these dimensionalities aren't aligning?
Thanks so much!