Calculating Prime Numbers (15p)¶
In Autumn 2025, the y86-assembly language project involves creating a program that checks whether all numbers in a given sequence are prime numbers, that is, natural number greater than 1 that are not a product of two smaller natural numbers. We can say it in another way: A prime number is a natural number greater than 1 that is divisible only by 1 and itself.
You can find a list of 1000 first prime numbers in wikipedia. We are listing here the prime numbers smaller than 1000:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997
A bit of background¶
The importance of prime numbers¶
Think about the last time you typed a password on a website. Behind the scenes, typography is just pixels… but the connection between your browser and the server is encrypted using keys based on huge prime numbers. Every time you see a lock icon in the address bar (that means that your website is safe), prime numbers are working quietly in the background.
When you type a password or send a message in a chat, that data travels across the network as bits. Anyone on the path could read those bits… unless we encrypt them.
In a very simplified, modern cryptography often works like this:
- Your browser and the server agree on some big numbers.
- Using those numbers, your browser turns your message into something that looks like random noise.
- Only someone with a secret key can turn that noise back into the original message.
Prime numbers come in here because many encryption systems (like RSA) are built on a simple idea:
- Multiplying two big prime numbers is easy.
- Given only the product, splitting it back into those two primes is extremely hard.
Let's see how we can generate a criptographic key:
- We choose two large prime numbers
pandq - We obtain their product
n = p * q - Some extra math done with p, q and n to create:
- a public key (you can share with the world). Based only on the value of
n - a private key (you keep it secret). The private key are obtained upon the two initial prime numbers
pandq
Everyone can use your public key to encrypt data to you, but only you, with the private key, can decrypt it. The security comes from the fact that an attacker only sees
n, and figuring out its prime factors is computationally expensive if p and q are large enough. Summarizing mathematically, criptography is based on:
- Multiplying two big numbers is computatational cheap
- Decomposing a big number in two prime numbers is computational very expensive.
Checking if a number is prime number¶
When numbers are small, deciding if they are prime is simple: try dividing them by the first prime numbers and check if they are divisible by any of them (and hence the number is COMPOSITE, that is, not prime). But as soon as numbers grow, to tenths of digitats this becomes extremely slow even for a computer. Over time, different methods have been developed to check if a number is prime, ranging from basic school-level reasoning to algorithms used in real cryptographic systems.
Let's check few of the methods that has been developed along the years.
- Sieve of Eratosthenes: The Sieve of Eratosthenes is often the first “smart” algorithm students see. Instead of testing one number at a time, we create a list of all integers from 2 up to some limit N. Then we repeatedly pick the next number in the list that is not crossed out, and cross out all its multiples. After going through all the numbers following this process the remaining numbers unmarked are the primes up to N. This method is surprisingly efficient when we want to know all primes in a range. However, it has two main drawbacks for large-scale computation. First, it needs memory proportional to N, because we must store the whole table. Second, it is designed for “all primes up to N”, not for testing a single huge number. When N becomes extremely large, for example something like 10^100, storing such a table is completely impractical. However if the numbers we are dealing with are not very big, we cannot discard this method that basically would consists in creating a list of elements (prime numbers) and check if the element is in the list.
- Exact primality tests: Between school-level methods and fully probabilistic algorithms there are more advanced exact primality tests. These tests always give the right answer, meaning, they are deterministic. They never call a composite number “prime” or the other way around. Examples include more clever forms of division using only primes as candidates, or more theoretical algorithms such as the AKS (Agrawal–Kayal–Saxena) primality test. These methods are interesting from a mathematical point of view and can be used for numbers of moderate size. Still, when we reach the range used in modern cryptography (hundreds or thousands of bits), even many exact algorithms become too slow to be practical in everyday software.
- Probabilistic primality tests: To handle very large numbers efficiently, modern systems often use probabilistic primality tests. These algorithms do not try to find a divisor. Instead, they look at how the number behaves in modular arithmetic and use this behaviour to decide whether the number is composite or probably prime (it is not deterministic). The key point is that they can detect composite numbers very quickly, and when they classify a number as probably prime, the chance of being wrong can be made extremely small by repeating the test with different parameters. This makes them ideal for applications like key generation, where we need to test many large random numbers and do not want to spend minutes or hours on each one.
Fermat Primality Test¶
NOTE: You do not need to fully understand this whole section to complete the exercise. it will give you some interesting background.
A simple example of a probabilistic test is the Fermat primality test, based on Fermat’s little theorem. The theorem says that if
p is a prime number and a is an integer that is not divisible by p, then raising a to the power p−1 gives a number that, when divided by p (or (a^(p−1)) / p) leaves a remainder of 1. Expressed more formally the Fermat Primality test: a^{p-1} \equiv 1 \pmod{p}
The Fermat tests use this theorem and a number
n (odd, larger than 2), chooses a base a with 1 < a < n, and checks whethera^{\,n-1} \equiv 1 \pmod{n}
If this operation does not hold, then
n is definetly composite (and then not prime). If it does hold, we say that n is a Fermat probable prime to base a. However, we cannot ensure that n is prime. There are composite numbers, called pseudoprimes, that satisfy this statement for many or even all choices of a.Miller-Rabin test¶
NOTE: You do not need to fully understand this whole section to complete the exercise. it will give you some interesting background.
Using Fermat theorem, the Millar Rabin primarity test builds a more robust probabilistic tests that improve on the weaknesses of the Fermat test. It adds extra structure so that fewer composite numbers can slip through as probably prime . The Miller–Rabin primality test is a widely used example: it is fast, works very well on large integers and, when combined with a few repetitions, gives an error probability so small that it is acceptable in cryptographic systems.
Miller–Rabin starts by taking the exponent from the Fermat test,
n−1, and decomposing it asn - 1 = 2^s \cdot d
where
d is and odd number. This factorisation is always possible, because every even number can be written as a power of two times an odd number. For example: 20 = 2^2 \cdot 5
Or other example:
144 = 2^4 \cdot 9
If we substitue the new expression in the Fermat test we get that we can compute that:
a^{2^s \cdot d} \equiv 1 \pmod{n}
And knowing that:
A \equiv 1 \pmod{n}
\;\Longleftrightarrow\;
A - 1 \equiv 0 \pmod{n}
We have that the previous expression is also equivalent to:
a^{2^s \cdot d} -1 \equiv 0 \pmod{n}
If we develop the expression:
a^{2^s d}-1
knowing that
(x+1)(x-1) = x^2 - 1
we have that:
\begin{aligned}
a^{2^{s} d} - 1
&\Longleftrightarrow a^{2^{s-1} \cdot 2 \cdot d} - 1
\Longleftrightarrow \left(a^{2^{s-1} d}\right)^{2} - 1\\\\
&\Longleftrightarrow \left(a^{2^{s-1} d} + 1\right)\left(a^{2^{s-1} d} - 1\right)\\\\
&\Longleftrightarrow \left(a^{2^{s-1} d} + 1\right)\left(\left(a^{2^{s-2} d}\right)^{2} - 1\right)\\\\
&\Longleftrightarrow \left(a^{2^{s-1} d} + 1\right)\left(a^{2^{s-2} d} + 1\right)\left(a^{2^{s-2} d} - 1\right)
\end{aligned}
and finally:
\left(a^{\,2^{\,s-1} d} + 1\right)
\left(a^{\,2^{\,s-2} d} + 1\right)
\cdots
\left(a^{\,d} + 1\right)
\left(a^{\,d} - 1\right)
Then we can conclude that:
\begin{aligned}
a^{2^s d} &\equiv 1 \pmod{n}\\\\
&\Longleftrightarrow
\left(a^{2^{s-1} d} + 1\right)
\left(a^{2^{s-2} d} + 1\right)
\cdots
\left(a^{d} + 1\right)
\left(a^{d} - 1\right)
\equiv 0 \pmod{n}
\end{aligned}
It can be deduced that if a number
n is prime, then the product of several terms must be congruent to zero modulo n, what means that at least one of those factors must itself be congruent to zero modulo n.Hence at least one of the following equalities must hold:
a^{d} - 1 \equiv 0 \pmod{n}
or
a^{d} + 1 \equiv 0 \pmod{n}
or
a^{2 d} + 1 \equiv 0 \pmod{n}
or
a^{2^{2} d} + 1 \equiv 0 \pmod{n}
…
or
a^{2^{s-1} d} + 1 \equiv 0 \pmod{n}
In a more compact form, the possible “good cases” are:
a^{d} \equiv 1 \pmod{n}
Or, for some
r with 0 < r < s-1, the following holds:a^{2^{r} d} \equiv -1 \pmod{n}
If any of these equalities holds, the number
n passes the test for this base a (it is probably prime with respect to a). If none of the factors is ≡ 0 mod (n) (that is, we never obtain either 1 or -1 where we should), then the product cannot be ≡ 0 mod(n), and we conclude that n is composite for that base a.This is exactly what Miller–Rabin does when it tests each facto: it checks whether a
1 appears at the beginning ( a^d) or whether a -1 appears somewhere in the chain of powers a^(2^r d).If some of these conditions are met, the number is probably prime. We could use another base to have higher probability of success. However, if none of this conditions is met, we can ensure that the number is composite
Summarizing:
n - 1 = 2^s \cdot d
where
d is odd.Miller–Rabin then tests the sequence:
a^d, a^{2d}, a^{4d}, a^{8d},\ldots, a^{2^{s} d} = a^{n-1}
The rules are:
- If
a^d ≡ 1 mod n, accept (PRIME for this base). - If
a^{2^r d} ≡ -1 mod nfor somer, accept (PRIME for this base). - If none of these happen, reject (COMPOSITE).
Requirements for the project¶
In this project, we are generating random numbers (stored in a sequence in memory). You must check if some of the numbers in the memory-stored sequence are prime numbers. To that end, you MUST use the Miller-Rabin algorithm. This is actually that is widely use to produce RSA keys.
- The project does not require checking whether all numbers are prime. Just return the first prime stored in the sequence.
- The implementation must follow the Miller Rabin algorithm presented before. Other solutions will not be accepted, although you may discuss alternative solutions found online in the project’s reflection section.
- While the y86 architecture is 64-bit, the largest number to be handled has only 10 figures, meaning the full range is not used.
- You may reuse assembly code from the exercises for implementing the algorithm.
The program must have the following features:
- The program starts at the
main:block without any prior initializations. - Do not use .pos 0. We will use that for testing in Lovelace. Just start writing your code. Consider that the first few instructions are for writing code. But its number is neligible.
- The program reads the sequence starting at memory location provided in
%r8(always bigger than0x1800.) - The sequence ends with the number zero.
- The program stops execution when it encounters the first number that is a prime number. This number should be returned in the
%raxregister. - If none of the numbers are prime (all are composite), the program returns the value 0 in the
%raxregister. - The goal is for the program to execute in approximately a maximum of
250000machine instructions for the provided test cases. This is not an absolute requirement; see the guidelines below.
Miller- Rabin algorithm to implement¶
NOTE: You do not need to understand this part
We are implementing a variation of the Miller-Rabin algorithm. In this case, instead of precomputing
We are implementing a variation of the Miller-Rabin algorithm. In this case, instead of precomputing
r and d to test the sequence we are going to use a small trick. We know from Miller-Rabin test that
n - 1 = 2^{s} \cdot d
if we call
pp = n - 1 = 2^{s} \cdot d
We see that
p = 2^{s} \cdot d
\;\Longleftrightarrow\;
\frac{p}{2} = 2^{s-1} \cdot d
Hence, by halving repeatidly the
p we test all the factors in the Miller-Rabin test. Using this shortcut, we could implement the algorithm as follows in C:
NOTE: This is the code you should implement EVEN if you do not fully understand it
Note that the functions
modpow and halve have been implemented in previous exercises. If you do were not be able to solve them, follow the instructions below. #define PRIME 1 /* true = Prime */
#define COMPOSITE 0 /* false = Composite */
/* Convenience typedef for unsigned 64-bit integers. */
typedef unsigned long long u64;
/**** PROTOTYPES***** /
/*
* Calculates the modular exponent of the given "base" raised to the power of
* "exp" and of modulo "mod": base ^ exp % mod.
*
* The modulus must be at least two and the base at least one, otherwise the
* behaviour is undefined.
*/
u64 modpow(u64 base, const u64 exp, const u64 mod);
/*
* Divides the given number by two, rounding down. A simple software
* implementation for right shifting by one.
*/
u64 halve(const u64 number);
/**** FUNCTION IMPLEMENTATION ****/
/*
* Perform one round of Miller-Rabin tests on the primality of the given
* number and using the given witness number as the base. The number to be
* tested must be an odd number at least three, and the witness must be a
* positive number smaller than the number being tested. Otherwise the
* behaviour is undefined.
*
* The result of the test is either PRIME or COMPOSITE (the return value). If
* the test says COMPOSITE, the number being tested is guaranteed to be a
* composite number (no false negatives). However, a result of PRIME does not
* mean the number being tested is a prime number (false positives are
* possible).
*/
int miller_rabin_round(const u64 number, const u64 witness)
{
u64 p = number - 1;
if (modpow(witness, p, number) != 1)
return COMPOSITE;
for (;;) {
if (p & 1)
return PRIME;
p = halve(p);
const u64 result = modpow(witness, p, number);
if (result == 1)
continue;
else if (result == number - 1)
return PRIME;
else
return COMPOSITE;
}
}
You must repeat the same algorithm for each one of the following witnesses:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37The witness numbers are chosen as they are the minimal set of bases for Miller-Rabin who can determine weather a number is prime or composite without checking other bases.
Important aspects to consider:
- The algorithm supposses that the number is odd, so it is highly recommend to check first if the number is odd. If the number is even, then is directly composite (can be divided by two).
- 0 and 1 cannot be tested (they are not either prime or composite)
- If one of the witnesses does not hold, then we can ensure that the number is composite, hence, we do not need to test more witnesses.
- ALWAYS the following condition must hold:
number >= witness - You should have implemented the modpow (modular exponentiation) in a Previous exercise. The exercises have also asked you to implement an efficient divide by two. If you did not manage to implement in the exercise, try to write a subroutine that is working here. Try to find a subroutine in internet or create it with AI (BUT PLEASE, DISCUSS THIS IN THE CODE COMMENTS, OTHERWISE IT WILL BE CONSIDERED PLAGIARISM). Other option is to just create an empty subroutine including nop and explain what you have tried. You are not going to have full grade, but at least your code might work and meet the minimum requirements to be accepted. Eg.
# Explain here what you have tried and why do you think your code is not working.
modpow:
nop
nop
nop
ret
Testing¶
Please, do not use Lovelace for testing, you can create your own list of numbers to check. This might be an example code that you can add for testing
.pos 0 irmovq $0x1800, %r8 .pos 0x1800 .align 8 .quad 3 .quad 7 .quad 0
You can use some list of primes in internet
When you upload the code to Lovelace, remove the test. Remember, in your final code, you should not include the .pos 0 . It is added by Lovelace to write the code.
Guidelines for the Project¶
- The project must be completed independently.
- While discussions with others are allowed, the code must be written individually.
- The project may be submitted incomplete.
- Bugs are acceptable, and the project does not need to pass all test cases completely.
- For the project to be accepted, the submitted code must show genuine effort and should work at least partially. Code that is still in its initial stages will not be accepted.
- The submitted assembly code must be well-documented with comments.
- Plagiarism rules are the same as for the other modules.
- In the comments, list others with whom you discussed the project to avoid suspicions of plagiarism due to similar solutions. Same
- You are not allowed to submit another student's code as your own.
- Same applies if you are using AI. YOU MUST CLEARLY INDICATE HOW YOU HAVE USED IT.
Implementation¶
The project can be implemented by reusing your own code from the exercises. However, this might result in an inefficient solution since large numbers will lead to a quick increase in instruction count due to multiplication, division, and other opreations are not optimized. Optimization opportunities can be explored here.
A good approach is to first create a working version of the program without worrying about the number of instructions. Once this version works, optimization can be considered.
The instruction count can be estimated by reviewing the code and manually calculating the number of instructions executed during runtime. For example:
- Loops and conditional branches should be taken into account.
- The code can also be run in a simulator, which provides an exact count of instructions executed. Instructions for installing a simulator are available in the assembly exercise section.
- There are several y86 simulators available online, such as this 32-bit y86 simulator. Note: In this simulator, the instruction suffix q must be replaced with l (e.g., irmovq -> irmovl).
- This year, Lovelace would provide the number of instructions. But please, do not use Lovelace just to check the number of instructions.
In the reflection section of the project, consider how the program could be made more efficient. For instance:
- Try to estimate the amount of instructions executed by each one of your subroutines or significant amount of code. Provide a justification for your estimate.
- If the program exceeds 1 500 000 instructions, reflect on this issue.
- Identify any clear bottlenecks in your implementation.
- Could any of the current operations be optimizez
- Is there an elegant single-instruction solution for bit shifting left (towards MSB)?
- Could loops/subroutine calls be optimized?
- What optimization choices did you make, and what was their impact (e.g., on instruction count)?
- etc.
Assembly Programming Guidelines¶
Programming in assembly is more challenging than high-level languages, so a few general tips:
- Plan your program thoroughly, just as you would for a high-level language.
- A "try and error" approach doesn’t work for more complex assembly programs.
- Using concepts from "structured programming" is beneficial. Divide the program into functional sections that can be implemented as subroutines and combined into a working program.
- What subroutines are needed? How should they be placed in memory to minimize jumps/branches? How are parameters passed?
- Could any code from the exercises be reused?
- Assign consistent purposes to registers in advance (and stick to them):
- Some registers are only for "global variables."
- Some registers are for "temporary variables."
- Some registers are for subroutine "parameters" (should the stack be used?).
- Using the stack:
- The stack grows "downwards" in memory, so ensure it doesn’t overwrite the code...
- Extra data can remain on the stack at the end of the program, as long as the required result is at the bottom (or topmost in memory).
- Test your algorithm on paper or implement it in C to understand the intermediate results assembly code should produce.
- In the C code, you could name variables after registers.
- Test your code in a simulator.
- Test subroutines as standalone programs.
- Test conditional statements similarly. Use all possible result types (e.g.,
<0, ==0, >0). - For jumps, it’s often sufficient to examine the
ZFflag (is the result zero or not zero?). - If your program exceeds 10,000 instructions, you can increase the maximum instruction limit using a simulator flag.
- For example, run the simulator with the command
ssim fibonacci.yo -l 50000for a 50,000-instruction limit. - The code must be well-documented with comments.
- Test this by stepping away from the code for a few days, then returning to it.
- Debugging/testing your program can be done by inserting
haltinstructions at appropriate locations, then checking the intermediate results/state in the simulator. For example, after implementing part of the algorithm, stop execution with ahaltcommand and inspect registers/memory. You can also usebrkinstructions to create breakpoints if you use the recommended online simulator. However, be sure that they are removed before sending the file to Lovelace. Lovelace do not recognize those. - It's definitely worth asking for help from the exercises or the Discord chat if needed. The staff is hired to assist you.
Grading¶
Minimum requirements¶
- Program must compile and not have runtime errors
- There should be clear proof that students tried to implement the provided algorithm. If it is not working add comments indicating what each subroutine is doing, what is failing and which are the possible causes.
- Students must get at least 5 points in the project, so it can be considered accepted.
Grading in detail¶
The project is worth a maximum of
15 points, distributed as follows:- A functioning y86-assembly program meeting the specifications:
0–9 points - Submissions exceeding 1 500 000 instructions will be accepted (even if the test wont continue in Lovelace) , and the program may include bugs. Instruction count influences points, and optimization options should be discussed in the reflection section.
- The submitted code must demonstrate genuine effort to solve the problem. At minimum it should be able to be executed without errors in the emulator.
- Maximum number of instructions accepted in each Lovelace test: 1 500 000
- Well-documented and readable assembly code with subroutine calls and consistent register usage:
0–3 points - Reflection on your implementation:
0–3 points - Address the reflection questions above. Runtime instruction count (or an estimate) must be included.
- Note: Reflection alone is not sufficient for passing the project!
- The reflection section should be 10–15 lines in the comments of your code file. Example:
# TKJ Project 2025 Prime number # Olli Student, email # # Reflection: # - Runtime instruction count: xxxx # - I optimized the program using method x.. # - ... # Code starts here main:
More specific Assessment guidelines are provided in the return box. In this module, there is no feedback meeting.
NOTE:
Grading takes into consideration, among other things, the number of instructions used, the number of test that succeed and other aspects such as code structure, comments and reflection.
Grading takes into consideration, among other things, the number of instructions used, the number of test that succeed and other aspects such as code structure, comments and reflection.
We have estimated the amount of steps that the most heavy test might need and define a few threshold to check the optimization of the code.:
- Maximum number of instructions: 1 500 000. More than that the checker will fail.
- 1 500 000 > number of steps > 1 000 000. Easily optimizable code
- 999 999 > number of steps > 250 000. Code OK.
- number of steps < 250 000. Highly optimized code.
Submission¶
The project deadline is 2025-12-22 23:59