CS202 -- Lab 6 (Spring 2024)


Inspiration

An integer is prime if it is greater than 1 and its only positive divisors are 1 and the number itself. Integers greater than 1 that are not prime are composite: each one is a product of powers of primes.


Given any number N, you can check if it's prime by dividing it by smaller natural numbers (starting at 2). Further, you don't even have to consider all of them; it's easy to show that you need to go no higher than sqrt(N), since any composite N = a * b with a <= b must have a <= sqrt(N).


Primes are important in several routines you may use later in your time at UTK, for example, in public-key cryptography. Primes are also cool for a bunch of mathematical reasons, for example, the unproven twin prime conjecture states that the number of pairs of primes that differ by two (e.g., 5 and 7; 11 and 13; 41 and 43) is infinite. As of early 2024, the largest known prime has over 24.8 million decimal digits; the previous record holder, with over 23.2 million digits, was discovered by a FedEx employee who lives in Germantown, TN. Finding large prime numbers has even led to financial awards given how important they are.

Part 1: Finding primes

Develop a simple C++ program called "primes1.cpp" that takes a single command-line argument and displays the prime numbers between 0 and the number provided, 20 per line.

For example, calling your program as follows:

./primes1 100

should output the following:

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71 
73, 79, 83, 89, 97

It would be good practice to write a separate function "isPrime(x)", implemented with whatever approach you'd like.


Hint: You can use the modulo operator (%) to check for clean division: if x % d is 0 for some d between 2 and x - 1, then d divides x evenly and x is not prime.
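
To make the hint concrete, here is a minimal sketch of trial division that stops at sqrt(x). The name isPrime comes from the suggestion above; the parameter type and the odd-only loop are assumptions made for illustration, and any approach that gives the right answers is fine.

```cpp
// Sketch only: returns true if x is prime, using trial division.
// The long long parameter type is an assumption, not a requirement.
bool isPrime(long long x) {
    if (x < 2) return false;       // 0, 1, and negatives are not prime
    if (x < 4) return true;        // 2 and 3 are prime
    if (x % 2 == 0) return false;  // even numbers above 2 are composite
    // Only odd d with d <= sqrt(x) need checking. Writing the bound as
    // d <= x / d avoids computing d * d, which could overflow.
    for (long long d = 3; d <= x / d; d += 2) {
        if (x % d == 0) return false;  // remainder 0: d divides x evenly
    }
    return true;
}
```

For example, isPrime(97) returns true after trying only 3, 5, 7, and 9 as divisors, while isPrime(100) returns false immediately because 100 is even.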

Part 2: Practicing using STL search functions

Develop another simple C++ program called "primes2.cpp" that caches known primes using an STL container of your choice (e.g., vector). This program, like others you wrote this semester, should take a series of numbers from standard input and report whether each one is prime or not prime. For example, the input

100
5
4
3
2
1
should report
not prime
prime
not prime
prime
prime
not prime

There is a wrinkle to this new program, however. Whenever the first number is provided, or any number larger than the previous max, determine the primes between 2 and this number (or between the old max and the new number) and store them in the container. So, for example, the STL container after 100 is provided should contain the 25 values listed above. For any value no larger than the current max (e.g., 5), use find() from the STL to see if it is in the list of known primes or not.
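
The caching behavior described above might be sketched as follows. The struct PrimeCache and the names knownPrimes, checkedUpTo, and query are hypothetical and chosen only for illustration; the assignment does not prescribe this layout, and reading the numbers from standard input is left out.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical cache: the primes found so far, plus the largest number
// that has already been examined.
struct PrimeCache {
    std::vector<int> knownPrimes;
    int checkedUpTo = 1;

    // Plain trial division, used only while extending the cache.
    static bool isPrime(int x) {
        if (x < 2) return false;
        for (int d = 2; d <= x / d; ++d) {
            if (x % d == 0) return false;
        }
        return true;
    }

    // Returns true if n is prime. When n exceeds the old max, the cache
    // is first extended from the old max up to n.
    bool query(int n) {
        while (checkedUpTo < n) {
            ++checkedUpTo;
            if (isPrime(checkedUpTo)) knownPrimes.push_back(checkedUpTo);
        }
        // std::find is a linear scan over the cached primes.
        return std::find(knownPrimes.begin(), knownPrimes.end(), n)
               != knownPrimes.end();
    }
};
```

With this sketch, calling query(100) on an empty cache fills knownPrimes with the 25 primes up to 100 and returns false; a later query(5) does no new prime testing and just searches the cache.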

Part 3: Determining the performance of your solution

There ** might ** be more efficient ways to find a previously cached prime number than find(). Similar to the hash analysis/challenge we did in class, prepare and submit a report.txt that does the following:

  1. Time and report "make test" at least three times using find() from the STL
  2. Similar to the bitset labs from CS102, try using a very simple vector v of size n where v[i] == 1 if i is prime, and report at least three runtimes
  3. Report the "load factor" of your solution for n = 2 million, i.e., how many entries in v are 1 (prime) out of the 2 million total possibilities?
  4. It is highly likely that our vector of actual primes is sorted, since we add primes in increasing order. Swap find with binary_search from the STL and report at least three runtimes.
  5. You may be thinking to yourself, "Self, what about a hash table!?" Although not a bad idea, this is highly dependent on both hash table size and finding a great hash function. In practice, and especially in interpreted languages such as Perl and Python, you will often use an associative container like the ones we are covering in class (map in C++, hash in Perl, dictionary in Python). The C++ map stores only the keys you insert, unlike the size-n vector from #2, and has a worst-case search time similar to binary search. For your final analysis, replace the vector implementation from #2 with a map<int, int> (or map<int, bool>) and compare the runtime of this approach to the original runtime from part 2 (#1).
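
The three alternative lookups from items 2, 4, and 5 might look roughly like the sketch below. All function names here (makeFlags, countFlagged, inSorted, makeMap, inMap) are hypothetical, the flag vector is given n + 1 slots so that index n is valid, and the use of vector<char> rather than vector<int> or vector<bool> is an arbitrary choice.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Item 2: flags[i] == 1 when i is prime. Assumes n >= 0.
std::vector<char> makeFlags(const std::vector<int>& primes, int n) {
    std::vector<char> flags(static_cast<std::size_t>(n) + 1, 0);
    for (int p : primes) {
        if (p >= 0 && p <= n) flags[p] = 1;
    }
    return flags;
}

// Item 3: number of entries set to 1; divide by n for the load factor.
std::size_t countFlagged(const std::vector<char>& flags) {
    return static_cast<std::size_t>(
        std::count(flags.begin(), flags.end(), 1));
}

// Item 4: binary_search requires the vector to be sorted ascending.
bool inSorted(const std::vector<int>& primes, int x) {
    return std::binary_search(primes.begin(), primes.end(), x);
}

// Item 5: a map keyed by prime; only primes are ever inserted.
std::map<int, bool> makeMap(const std::vector<int>& primes) {
    std::map<int, bool> m;
    for (int p : primes) m[p] = true;
    return m;
}

bool inMap(const std::map<int, bool>& m, int x) {
    return m.find(x) != m.end();  // find() does not insert missing keys
}
```

Note that inMap uses find() rather than m[x]: operator[] would insert a default entry for every non-prime you look up, which changes both the map size and your timings.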

Extra credit

In the historical Dr. Plank style, 140/302 students would be asked to complete TopCoder challenges to further hone their coding skills, after which a TA would present a solution at a high level (usually with no actual code). There are quite a few on primes, but I want you to "geek out" further. We will award 2 extra credit points to students who win one of the challenges below. Note that they are different, i.e., a well-done attempt at #2 probably will not win #1. Let us know if you have questions or concerns.

  1. The student who finds the most primes. A great start to this is to look at the Sieve of Eratosthenes, an ancient algorithm (over 2,000 years old) that is highly efficient for relatively small n. This takes O(n log log n) time, but there is a pretty big constant hiding there (see reading(s)).
  2. The student who finds the largest prime. Although not guaranteed to "win," a great place to start is to consider Mersenne primes in combination with your part 2 solution.
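
As a starting point for challenge #1, here is a basic, unoptimized sketch of the Sieve of Eratosthenes. The function name sieve and the vector<char> marker array are illustrative assumptions; a competitive entry would likely need a more memory-efficient variant.

```cpp
#include <cstddef>
#include <vector>

// Sketch only: returns all primes <= n in increasing order.
std::vector<int> sieve(int n) {
    std::vector<int> primes;
    if (n < 2) return primes;
    // composite[i] == 1 once i is known to be a multiple of a smaller prime.
    std::vector<char> composite(static_cast<std::size_t>(n) + 1, 0);
    for (long long i = 2; i <= n; ++i) {
        if (composite[i]) continue;  // already crossed out
        primes.push_back(static_cast<int>(i));
        // Start at i * i: smaller multiples of i have a smaller prime
        // factor and were crossed out in an earlier pass.
        for (long long j = i * i; j <= n; j += i) {
            composite[j] = 1;
        }
    }
    return primes;
}
```

Calling sieve(100) returns the same 25 primes shown in the part 1 example, without ever using the % operator.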

Rubric

+2 primes1.cpp and primes2.cpp are well formatted, commented (inc. name, assignment, and overview), with reasonable variable names
+4 Matches the target output for the primes up to 100 for part 1 (see above)
+11 Passes make test for part 2 (primes2.cpp; 1 point per test)
+14 Report (report.txt) has the requested values. Namely:
- make test times (n=3) using find (3 points, 1 per value)
- test times (n=3) using a vector of size max value, with v[i] == 1 if i is prime (3 points, 1 per value)
- Provides the correct # of primes / load factor (2 points)
- test times (n = 3) using binary_search from the STL instead of find (1 point per value)
- test times (n = 3) using either a map or a set instead of a vector (1 point per value)

Testing your code prior to submission

To facilitate testing, you were previously asked to clone the course GitHub repository as follows:

git clone https://github.com/semrich/CS202-22.git cs202

For this assignment, update this clone by using the following:

git pull

Submission

Name your solution to part 1 "primes1.cpp," the solution to part 2, "primes2.cpp," and your final solution to part 3 that uses a map or a set as "primes3.cpp." Name your report "report.txt." Submitting part 3 is optional as mentioned in class.

Then, please run the following command prior to submission:

tar -cvf lab6.tar primes1.cpp primes2.cpp report.txt

Note: Although submission will be facilitated by Canvas, we will compile and test on EECS lab machines!

If you develop your solution elsewhere please make sure it works on the lab computers prior to the deadline.